To Jeremy Butterfield


Preface 


‘Der Kopf, so gesehen, hat mit dem Kopf, so gesehen, auch nicht die leiseste Ähnlichkeit
(...) Der Aspektwechsel. “Du würdest doch sagen, dass sich das Bild jetzt gänzlich
geändert hat!” Aber was ist anders: mein Eindruck? meine Stellungnahme? (...) Ich
beschreibe die Änderung wie eine Wahrnehmung, ganz, als hätte sich der Gegenstand vor
meinen Augen geändert.’ (Wittgenstein, Philosophische Untersuchungen II, §§127, 129).¹


As the well-known picture above is meant to allegorize, some physical systems 
admit a dual description in either classical or quantum-mechanical terms. According 
to Bohr’s “doctrine of classical concepts”, measurement apparatuses are examples
of such systems. More generally—as hammered down by decoherence theorists— 
the classical world around us is a case in point. As will be argued in this book, the 
measurement problem of quantum mechanics (highlighted by Schrödinger’s Cat) is
caused by this duality (rather than resolved by it, as Bohr is said to have thought). 


¹ ‘The head seen in this way hasn’t even the slightest similarity to the head seen in that way (...)
The change of aspect. “But surely you’d say that the picture has changed altogether now!” But what
is different: my impression? my attitude? (...) I describe the change like a perception; just as if the 
object has changed before my eyes.’ Translation: G.E.M. Anscombe, P.M.S. Hacker, & J. Schulte 
(Wittgenstein, 2009/1953, pp. 205-206). 



The aim of this book is to analyze the foundations of quantum theory from the 
point of view of classical-quantum duality, using the mathematical formalism of 
operator algebras on Hilbert space (and, more generally, C*-algebras) that was orig- 
inally created by von Neumann (followed by Gelfand and Naimark). In support of 
this analysis, but also as a matter of independent interest, the book covers many of 
the traditional topics one might expect to find in a treatise on the foundations of 
quantum mechanics, like pure and mixed states, observables, the Born rule and its 
relation to both single-case probabilities and long-run frequencies, Gleason’s Theorem,
the theory of symmetry (including Wigner’s Theorem and its relatives, culminating
in a recent theorem of Hamhalter’s), Bell’s Theorem(s) and the like, quantization
theory, indistinguishable particles, large systems, spontaneous symmetry breaking,
the measurement problem, and (intuitionistic) quantum logic. One also finds
a few idiosyncratic themes, such as the Kadison–Singer Conjecture, topos theory
(which naturally injects intuitionism into quantum logic), and an unusual emphasis 
on both conceptual and mathematical aspects of limits in physical theories. 

All of this is held together by what we call Bohrification, i.e., the mathematical 
interpretation of Bohr’s classical concepts by commutative C*-algebras, which in 
turn are studied in their quantum habitat of noncommutative C*-algebras. 

Thus the book is mostly written in mathematical physics style, but its real subject 
is natural philosophy. Hence its intended readership consists not only of mathemati- 
cal physicists, but also of philosophers of physics, as well as of theoretical physicists 
who wish to do more than ‘shut up and calculate’, and finally of mathematicians who 
are interested in the mathematical and conceptual structure of quantum theory. 

To serve all these groups, the native mathematical language (i.e. of C*-algebras) 
is introduced slowly, starting with finite sets (as classical phase spaces) and finite- 
dimensional Hilbert spaces. In addition, all advanced mathematical background that 
is necessary but may distract from the main development is laid out in extensive 
appendices on Hilbert spaces, functional analysis, operator algebras, lattices and 
logic, and category theory and topos theory, so that the prerequisites for this book 
are limited to basic analysis and linear algebra (as well as some physics). These 
appendices not only provide a direct route to material that otherwise most readers 
would have needed to extract from thousands of pages of diverse textbooks, but they 
also contain some original material, and may be of interest even to mathematicians. 

In summary, the aims of this book are similar to those of its peerless paradigm: 


‘Der Gegenstand dieses Buches ist die einheitliche, und, soweit als möglich und angebracht,
mathematisch einwandfreie Darstellung der neuen Quantenmechanik (...). Dabei soll das
Hauptgewicht auf die allgemeinen und prinzipiellen Fragen, die im Zusammenhange mit
dieser Theorie entstanden sind, gelegt werden. Insbesondere sollen die schwierigen und
vielfach noch immer nicht restlos geklärten Interpretationsfragen näher untersucht werden.’
(von Neumann, Mathematische Grundlagen der Quantenmechanik, 1932, p. 1)²


² ‘The object of this book is to present the new quantum mechanics in a unified presentation which,
so far as it is possible and useful, is mathematically rigorous. (...) Therefore the principal emphasis
shall be placed on the general and fundamental questions which have arisen in connection with this
theory. In particular, the difficult problems with interpretation, many of which are even now not
fully resolved, will be investigated in detail.’ Translation: R.T. Beyer (von Neumann, 1955, p. vii).


Two other quotations the author often had in mind while writing this book are: 


‘And although the whole of philosophy is not immediately evident, still it is better to add 
something to our knowledge day by day than to fill up men’s minds in advance with the 
preconceptions of hypotheses.’ (Newton, draft preface to Principia, 1686). 


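‘Juist het feit dat een genie als DESCARTES volkomen naast de lijn van ontwikkeling is blijven
staan, die van GALILEI naar NEWTON voert (...) [is] een phase van den in de historie
zoo vaak herhaalden strijd tusschen de bescheidenheid der mathematisch-physische methode,
die na nauwkeurig onderzoek de verschijnselen der natuur in steeds meer omvattende
schemata met behulp van de exacte taal der mathesis wil beschrijven en den hoogmoed van
het philosophische denken, dat in één genialen greep de heele wereld wil omvatten (...).’
(Dijksterhuis, Val en Worp, 1924, p. 343).⁴


⁴ ‘Precisely the fact that a genius like Descartes remained entirely outside the line of development
that leads from Galileo to Newton (...) [is] a phase of the struggle, so often repeated in history,
between the modesty of the mathematical-physical method, which after careful investigation seeks to
describe the phenomena of nature in ever more comprehensive schemes with the help of the exact
language of mathematics, and the pride of philosophical thought, which seeks to encompass the whole
world in a single stroke of genius (...).’

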
Introduction


After 25 years of confusion and even occasional despair, in March 1926 physicists 
suddenly had two theories of the microscopic world (Heisenberg, 1925; Schrödinger,
1926ab), which hardly could have looked more different. Heisenberg’s matrix mechanics
(as it came to be called a bit later) described experimentally measurable
quantities (i.e., “observables”) in terms of discrete quantum numbers, and apparently
lacked a state concept. Schrödinger’s wave mechanics focused on unobservable
continuous matter waves apparently playing the role of quantum states; at the
time the only observable within reach of his theory was the energy. Einstein is even 
reported to have remarked in public that the two theories excluded each other. 
Nonetheless, Pauli (in a letter to Jordan dated 12 April 1926), Schrödinger
(1926c) himself, Eckart (1926), and Dirac (1927) argued—it is hard to speak of 
a complete argument even at a heuristic level, let alone of a mathematical proof 
(Muller, 1997ab)— that in fact the two theories were equivalent! A rigorous equiv- 
alence proof was given by von Neumann (1927ab), who (at the age of 23) was the 
first to unearth the mathematical structure of quantum mechanics as we still under- 
stand it today. His effort, culminating in his monograph Mathematische Grundlagen 
der Quantenmechanik (von Neumann, 1932), was based on the abstract concept of 
a Hilbert space, which previously had only appeared in examples (i.e. specific real- 
izations) going back to the work of Hilbert and his school on integral equations. 
The novelty of von Neumann’s abstract approach may be illustrated by the advice 
Hilbert’s former student Schmidt gave to von Neumann even at the end of the 1920s: 


‘Nein! Nein! Sagen Sie nicht Operator, sagen Sie Matrix!’ (Bernkopf, 1967, p. 346).⁵


Von Neumann proposed that observable quantities be interpreted as (possibly unbounded)
self-adjoint operators on some Hilbert space, whilst pure states are realized
as rays (i.e. unit vectors up to a phase) in the same space; finally, the inner product
provides the probabilities introduced by Born (1926ab). In particular, Heisenberg’s
observables were operators on $\ell^2(\mathbb{N})$, whereas Schrödinger’s wave-functions
were unit vectors in $L^2(\mathbb{R}^3)$. A unitary transformation between these Hilbert spaces
then provided the mathematical equivalence between their competing theories.
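
The ingredients just listed can be made concrete in a finite-dimensional toy model. The following is only a minimal sketch, assuming a hypothetical two-level system in place of the infinite-dimensional spaces mentioned above; the vector `psi`, the basis `e0`, `e1` and the rotation `U` are illustrative choices, not taken from the text. It checks that the Born probabilities depend only on the ray of the state and are unchanged when states and eigenvectors are transported by the same unitary.

```python
import math

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Apply the matrix M (nested lists) to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def born(e, psi):
    """Born probability |<e, psi>|^2 for unit vectors e and psi."""
    return abs(inner(e, psi)) ** 2

# Hypothetical two-level system: a unit vector psi and the eigenbasis
# (e0, e1) of some observable with nondegenerate spectrum.
psi = [0.6, 0.8j]
e0, e1 = [1, 0], [0, 1]
p0, p1 = born(e0, psi), born(e1, psi)

# Rays: multiplying psi by a phase leaves the probabilities unchanged.
phase = complex(math.cos(1.3), math.sin(1.3))
p0_phase = born(e0, [phase * c for c in psi])

# A unitary U (here a plane rotation) carries states and eigenvectors to
# another realization of the same structure without changing probabilities.
t = 0.7
U = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
p0_rotated = born(apply(U, e0), apply(U, psi))

print(p0, p1, p0_phase, p0_rotated)
```

In this sketch the unitary plays the role that the transformation between the two Hilbert spaces plays in the text: nothing observable changes under it.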


⁵ ‘No! No! You shouldn’t say operator, you should say matrix!’
This story is well known, but it is worth emphasizing (cf. Zalamea, 2016, §1.1)
that the most significant difference between von Neumann’s mathematical axiom- 
atization of quantum mechanics and Dirac’s heuristic but beautiful and systematic 
treatment of the same theory (Dirac, 1930) was not so much the lack of mathemat- 
ical rigour in the latter—although this point was stressed by von Neumann (1932, 
p. 2) himself, who was particularly annoyed with Dirac’s δ-function and his closely
related assumption that every self-adjoint operator can be diagonalized in the naive 
way of having a basis of eigenvectors—but the fact that Dirac’s approach was rela- 
tive to the choice of a (generalized) basis of a Hilbert space, whereas von Neumann’s 
was absolute. In this sense, as a special case of his (and Jordan’s) general transformation
theory, Dirac showed that Heisenberg’s matrix mechanics and Schrödinger’s
wave mechanics were related by a (unitary) transformation, whereas for von Neu- 
mann they were two different realizations of his abstract (separable) Hilbert space. 
In particular, von Neumann’s approach a priori dispenses with a basis choice alto- 
gether; this is precisely the difference between an operator and a matrix Schmidt al- 
luded to in the above quotation. Indeed, von Neumann’s abstract approach (which as 
a co-founder of functional analysis he shared with Banach, but not with his mentor 
Hilbert) was remarkable even in mathematics; in physics it must have been dazzling. 

It is instructive to compare this situation with special relativity, where, so to 
speak, Dirac would write down the theory in terms of inertial frames of reference, 
so as to subsequently argue that due to Poincaré-invariance the physical content of 
the theory does not depend on such a choice. Von Neumann, on the other hand (had 
he ever written a treatise on relativity), would immediately present Minkowski’s 
space-time picture of the theory and develop it in a coordinate-free fashion. 

However, this analogy is also misleading. In special relativity, all choices of iner- 
tial frames are genuinely equivalent, but in quantum mechanics one often does have 
preferred observables: as Bohr would argue from his Como Lecture in 1927 onwards 
(Bohr, 1928), these observables are singled out by the choice of some experimental 
context, and they are jointly measurable iff they commute (see also below). Though 
not necessarily developed with Bohr’s doctrine in mind, Dirac’s approach seems 
tailor-made for this situation, since his basis choice is equivalent to a choice of 
“preferred” physical observables, namely those that are diagonal in the given basis 
(for Heisenberg this was energy, while for Schrödinger it was position).

Von Neumann’s abstract approach can deal with preferred observables and ex- 
perimental contexts, too, though the formalism for doing so is more demanding. 
Namely, for reasons ranging from quantum theory to ergodic theory via unitary 
group representations on Hilbert space, from 1930 onwards von Neumann devel- 
oped his theory of “rings of operators” (nowadays called von Neumann algebras), 
partly in collaboration with his assistant Murray (von Neumann, 1930, 1931, 1938, 
1940, 1949; Murray & von Neumann, 1936, 1937, 1943). For us, at least at the
moment, the point is that Dirac’s diagonal observables are formalized by maximal
commutative von Neumann algebras A on some Hilbert space. These often come
naturally with some specific realization of a Hilbert space; for example, on Heisenberg’s
Hilbert space $\ell^2(\mathbb{N})$ one has $A_H = \ell^\infty(\mathbb{N})$, while Schrödinger’s $L^2(\mathbb{R}^3)$ is host
to $A_S = L^\infty(\mathbb{R}^3)$, both realized as multiplication operators (cf. Proposition B.73).
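
A finite-dimensional analogue may help to see what maximality means here. The sketch below assumes a three-dimensional Hilbert space, on which the diagonal matrices play the role of the multiplication operators; the matrices `D`, `M_diag` and `M_swap` are hypothetical examples. Since the (i, j) entry of MD - DM equals M_ij (d_j - d_i), a matrix commuting with a diagonal matrix whose entries are distinct must itself be diagonal, so nothing can be added to the diagonal algebra without losing commutativity.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """Commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def diag(*entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

D = diag(1, 2, 3)                            # diagonal, distinct entries
M_diag = diag(5, -1, 2)                      # another diagonal matrix
M_swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # swaps the first two basis vectors

c_diag = commutator(M_diag, D)   # vanishes: diagonal matrices commute
c_swap = commutator(M_swap, D)   # nonzero: M_swap is outside the algebra

print(is_zero(c_diag), c_swap)
```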
Although the second (1931) paper in the above list shows that von Neumann was
well aware of the importance of the commutative case of his theory of operator algebras,
he (perhaps deliberately) missed the link with Bohr’s ideas. As explained
in the remainder of this Introduction, providing this link is one of the main themes 
of this book, but we will do so using the more powerful formalism of C*-algebras. 
Introduced by Gelfand & Naimark (1943), these are abstractions and generaliza- 
tions of von Neumann algebras, so abstract indeed that Hilbert spaces are not even 
mentioned in their definition. Nonetheless, C*-algebras remain very closely tied to 
Hilbert spaces through the GNS-construction originating with Gelfand & Naimark 
(1943) and Segal (1947b), which implies that any C*-algebra is isomorphic to a 
well-behaved algebra of bounded operators on some Hilbert space (see §C.12). 

Starting with Segal (1947a), C*-algebras have become an important tool in math- 
ematical physics, where traditionally most applications have been to quantum sys- 
tems with infinitely many degrees of freedom, such as quantum statistical mechan- 
ics in infinite volume (Ruelle, 1969; Israel, 1979; Bratteli & Robinson, 1981; Haag, 
1992; Simon, 1993) and quantum field theory (Haag, 1992; Araki, 1999). 

Although we draw on the first body of literature, and were at least influenced
by the second, the present book employs C*-algebras in a rather different fashion, 
in that we exploit the unification they provide of the commutative and the noncom- 
mutative “worlds” into a single mathematical framework (where one should note 
that as far as physics is concerned, the commutative or classical case is not purely 
C*-algebraic in character, because one also needs a Poisson structure, see Chapter 
3). This unified language (supplemented by some category theory, group(oid) the- 
ory, and differential geometry) gives a mathematical handle on Wittgenstein’s As- 
pektwechsel between classical and quantum-mechanical modes of description (see 
Preface), which in our view lies at the heart of the foundations of quantum physics. 
This “change of perspective”, which roughly speaking amounts to switching (and
interpolating) between commutative and noncommutative C*-algebras, is added to 
Dirac’s transformation theory (which comes down to switching between generalized 
bases, or, equivalently, between maximal commutative von Neumann algebras). 

The central conceptual importance of the Aspektwechsel for this book in turn 
derives from our adherence to Bohr’s doctrine of classical concepts, which forms 
part of the Copenhagen Interpretation of quantum mechanics (here defined strictly 
as a body of ideas shared by Bohr and Heisenberg). We let the originators speak: 


‘It is decisive to recognize that, however far the phenomena transcend the scope of classical 
physical explanation, the account of all evidence must be expressed in classical terms. The 
argument is simply that by the word experiment we refer to a situation where we can tell 
others what we have done and what we have learned and that, therefore, the account of 
the experimental arrangements and of the results of the observations must be expressed in 
unambiguous language with suitable application of the terminology of classical physics.’ 
(Bohr, 1949, p. 209) 


‘The Copenhagen interpretation of quantum theory starts from a paradox. Any experiment 
in physics, whether it refers to the phenomena of daily life or to atomic events, is to be 
described in the terms of classical physics. The concepts of classical physics form the lan- 
guage by which we describe the arrangement of our experiments and state the results. We 
cannot and should not replace these concepts by any others.’ (Heisenberg 1958, p. 44) 
The last quotation even opens Heisenberg’s only systematic presentation of the
Copenhagen Interpretation, which forms Chapter III of his Gifford Lectures from
1955; apparently this was the first occasion where the name “Copenhagen Interpre- 
tation” was used (Howard, 2004). In our view, several other defining claims of the 
Copenhagen Interpretation appear to be less well founded, if not unwarranted, al- 
though they may have been understandable in the historical context where they were 
first proposed (in which the new theory of quantum mechanics needed to get going 
even in the face of the foundational problems that all of the originators—including 
Bohr and Heisenberg—were keenly aware of). These spurious claims include: 


• The emphatic rejection of the possibility of analyzing what is going on during measurements,
as expressed in typical Bohr parlance by claims like:


‘According to the quantum theory, just the impossibility of neglecting the interaction 
with the agency of measurement means that every observation introduces a new uncon- 
trollable element.’ (Bohr, 1928, p. 584), 


or, with similar (but somehow less off-putting) dogmatism by Heisenberg: 


‘So we cannot completely objectify the result of an observation’ (1958, p. 50). 


• The closely related interpretation of quantum-mechanical states (which Heisenberg
indeed referred to as “probability functions”) as mere catalogues of the probabilities
attached to possible outcomes of experiments, as in:


‘what one deduces from observation is a probability function, a mathematical expression 
that combines statements about possibilities or tendencies with statements about our 
knowledge of facts’ (Heisenberg 1958, p. 50).


In addition, there are two ingredients of the avowed Copenhagen Interpretation that Bohr
and Heisenberg actually seem to have disagreed about. These are:


• The collapse of the wave-function (i.e., upon completion of a measurement),
which was introduced by Heisenberg (1927) in his paper on the uncertainty relations.
As we shall see in Chapter 11, this idea was widely adopted by the pioneers
of quantum mechanics (and it still is), but apparently it was never endorsed by 
Bohr, who saw the wave-function as a “symbolic” expression (cf. Dieks, 2016a). 

• Bohr’s doctrine of Complementarity, which, though never precisely articulated,
he considered to be a revolutionary philosophical insight of central importance to 
the interpretation of quantum mechanics (and even beyond). Heisenberg, on the 
other hand, regarded complementary descriptions (which Bohr saw as incompat- 
ible) as mathematically equivalent and at best paid lip-service to the idea. The 
reason for this discord probably lies in the fact that Heisenberg was typically 
guided by (quantum) theory, whereas Bohr usually started from experiments; 
Heisenberg once even referred to his mentor as a ‘philosopher of experiment’. 
Therefore, Heisenberg was satisfied that for example position and momentum 
were related by a unitary operator (i.e. the Fourier transform), whereas Bohr had 
the incompatible experimental arrangements in mind that were required to mea- 
sure these quantities. Their difference, then, contrasted theory and experiment. 
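
Heisenberg’s point that position and momentum are related by a unitary operator can be illustrated in a finite-dimensional toy model. The following is a sketch under stated assumptions: the Hilbert space is taken to be C^n with n = 4, a diagonal “clock” matrix `Z` stands in for one observable, a cyclic “shift” matrix `X` for its conjugate partner, and the discrete Fourier matrix `F` replaces the Fourier transform. None of these names occur in the text, and the model is only an analogue of the continuous case.

```python
import cmath
import math

n = 4
omega = cmath.exp(2j * math.pi / n)

# Discrete Fourier matrix, cyclic shift (e_k -> e_{k+1 mod n}) and clock matrix.
F = [[omega ** (j * k) / math.sqrt(n) for k in range(n)] for j in range(n)]
X = [[1 if j == (k + 1) % n else 0 for k in range(n)] for j in range(n)]
Z = [[omega ** j if j == k else 0 for k in range(n)] for j in range(n)]
I = [[1 if j == k else 0 for k in range(n)] for j in range(n)]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

# F is unitary, and it turns the shift into the diagonal clock matrix.
err_unitary = max_abs_diff(matmul(F, dagger(F)), I)
err_conjugation = max_abs_diff(matmul(matmul(F, X), dagger(F)), Z)

print(err_unitary, err_conjugation)
```

Both errors vanish up to rounding, which is the mathematical equivalence Heisenberg had in mind; the sketch says nothing about the incompatible experimental arrangements that concerned Bohr.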
Let us now review the philosophical motivation Bohr and Heisenberg gave for their
shared doctrine of classical concepts. First, Bohr (in his typically convoluted prose):


‘The elucidation of the paradoxes of atomic physics has disclosed the fact that the unavoid- 
able interaction between the objects and the measuring instruments sets an absolute limit 
to the possibility of speaking of a behavior of atomic objects which is independent of the 
means of observation. We are here faced with an epistemological problem quite new in nat- 
ural philosophy, where all description of experience has so far been based on the assump- 
tion, already inherent in ordinary conventions of language, that it is possible to distinguish 
sharply between the behavior of objects and the means of observation. This assumption 
is not only fully justified by all everyday experience but even constitutes the whole basis 
of classical physics. (...) As soon as we are dealing, however, with phenomena like indi- 
vidual atomic processes which, due to their very nature, are essentially determined by the 
interaction between the objects in question and the measuring instruments necessary for 
the definition of the experimental arrangement, we are, therefore, forced to examine more 
closely the question of what kind of knowledge can be obtained concerning the objects. In 
this respect, we must, on the one hand, realize that the aim of every physical experiment— 
to gain knowledge under reproducible and communicable conditions—leaves us no choice 
but to use everyday concepts, perhaps refined by the terminology of classical physics, not 
only in all accounts of the construction and manipulation of the measuring instruments but 
also in the description of the actual experimental results. On the other hand, it is equally 
important to understand that just this circumstance implies that no result of an experiment 
concerning a phenomenon which, in principle, lies outside the range of classical physics 
can be interpreted as giving information about independent properties of the objects.’ 


This text has been taken from Bohr (1958, p. 25), but very similar passages appear 
in many of Bohr’s writings from his famous Como Lecture (Bohr, 1928) onwards. 
In other words, the (supposedly) unavoidable interaction between the objects and 
the measuring instruments, which for Bohr represents the characteristic feature of 
quantum mechanics (and which we would now express in terms of entanglement, 
of which concept Bohr evidently had an intuitive grasp), threatens the objectivity 
of the description that is characteristic of (if not the defining property of) classical
physics. However, this threat can be countered by describing quantum mechanics
through classical physics, which (or so the argument goes) restores objectivity. Else- 
where, we see Bohr also insisting on the need for classical concepts in defining any 
meaningful theory whatsoever, as these are the only concepts we really understand 
(though, as he always insists, classical concepts are at the same time challenged by 
quantum theory, as a consequence of which their use is necessarily limited). 
Although Heisenberg’s arguments for the necessity of classical concepts start 
similarly, they eventually take a conspicuously different direction from Bohr’s: 


‘To what extent, then, have we finally come to an objective description of the world, espe- 
cially of the atomic world? In classical physics science started from the belief—or should 
one say from the illusion?—that we could describe the world or at least parts of the world 
without any reference to ourselves. This is actually possible to a large extent. We know that 
the city of London exists whether we see it or not. It may be said that classical physics 
is just that idealization in which we can speak about parts of the world without any ref- 
erence to ourselves. Its success has led to the general ideal of an objective description of 
the world. Objectivity has become the first criterion for the value of any scientific result. 
Does the Copenhagen interpretation of quantum theory still comply with this ideal? One 
may perhaps say that quantum theory corresponds to this ideal as far as possible. Certainly 
quantum theory does not contain genuine subjective features, it does not introduce the mind 
of the physicist as a part of the atomic event. But it starts from the division of the world
into the object and the rest of the world, and from the fact that at least for the rest of the 
world we use the classical concepts in our description. This division is arbitrary and his- 
torically a direct consequence of our scientific method; the use of the classical concepts is 
finally a consequence of the general human way of thinking. But this is already a reference 
to ourselves and in so far our description is not completely objective. (...) 


The concepts of classical physics are just a refinement of the concepts of daily life and are 
an essential part of the language which forms the basis of all natural science. Our actual 
situation in science is such that we do use the classical concepts for the description of the 
experiments, and it was the problem of quantum theory to find theoretical interpretation of 
the experiments on this basis. There is no use in discussing what could be done if we were 
other beings than we are. (...) 


Natural science does not simply describe and explain nature; it is a part of the interplay 
between nature and ourselves; it describes nature as exposed to our method of questioning.’ 
(Heisenberg, 1958, pp. 55–56, 56, 81)


The well-known last part may indeed have been the source of the crucial ‘I’m the 
one who knocks’ episode in the superb TV series Breaking Bad (whose criminal main
character operates under the cover name of “Heisenberg”). This is worth mentioning
here, because Heisenberg (and to a lesser extent also Bohr) displays a puzzling
mixture of the hubris of claiming that quantum mechanics has restored Man’s
position at the center of the universe and the modesty of recognizing that nonetheless 
Man has to know his limitations (in necessarily relying on the classical concepts he 
happens to be familiar with at the current state of evolution and science). 

Our own reasons for favoring the doctrine of classical concepts are threefold. 
The first is closely related to Heisenberg’s and may be expressed even better by the 
following passage from a book by the renowned Dutch primatologist Frans de Waal: 


‘Die Verwandlung [i.e., The Metamorphosis by Franz Kafka, in which Gregor Samsa famously
wakes up to find himself transformed into an insect], published in 1915, was an
unusual take-off for a century in which anthropocentrism declined. For metaphorical reasons,
the author had picked a repulsive creature, forcing us from the first page onwards to
feel what it would be like to be an insect. Around the same time, the German biologist
Jakob von Uexküll drew attention to the fact that each particular species has its own perspective,
which he called its Umwelt. To illustrate this new idea, Uexküll took his readers
on a tour through the worlds of various creatures. Each organism observes its environment
in its own peculiar way, he argued. A tick, which has no eyes, climbs onto a grass blade,
where it awaits the scent of butyric acid off the skin of mammals that pass by. Experiments
have demonstrated that ticks may survive without food for as long as 18 years, so that a tick
has ample time to wait for her prey, jump on it, and suck its warm blood, after which she
is ready to lay her eggs and die. Are we in a position to understand the Umwelt of a tick?
It seems unbelievably poor compared to ours, but Uexküll regarded its simplicity rather as
a strength: ticks have set themselves a narrow goal and hence cannot easily be distracted.
Uexküll analysed many other examples, and showed how a single environment offers hundreds
of different realities, each of which is unique for some given species. (...) Some
animals merely register ultraviolet light, others live in a world of odors, or of touch, like a
star nose mole. Some animals sit on a branch of an oak, others live underneath the bark of
the same oak, whilst a fox family digs a hole underneath its roots. Each animal observes the
tree differently.’ (De Waal, 2016, pp. 15–16. Translation by the author).


Indeed, it is hardly an accident that De Waal preceded this passage by a quotation 
from Heisenberg almost identical to the last one above. 
A second argument in favour of the doctrine lies in the possibility of a peaceful
outcome of the Bohr–Einstein debate, or at least of an important part of it; cf. Landsman
(2006a), which was inspired by earlier work of Raggio (1981, 1988) and Bacciagaluppi
(1993). This debate initially centered on Einstein’s attempts to debunk
the Heisenberg uncertainty relations, and subsequently, following Einstein’s grudg- 
ing acceptance of their validity, entered its most famous and influential phase, in 
which Einstein tried to prove that quantum mechanics, although admittedly correct, 
was incomplete. One could argue that both antagonists eventually lost this part of 
the debate, since Einstein’s goal of a local realistic (quantum) physics was quashed 
by the famous work of Bell (1964), whereas against Bohr’s views, deterministic ver- 
sions of quantum mechanics such as Bohmian mechanics and the Everett (i.e. Many 
Worlds) Interpretation turned out to be at least logical possibilities. 

However incompatible the views of Einstein and Bohr on physics and its goals 
may have been, unknown to them a common battleground did in fact exist and could 
even have led to a reconciliation of at least the epistemological views of the great ad- 
versaries. The common ground referred to concerns the problem of objectification, 
which at first sight Bohr and Einstein approached in completely different ways: 


• Bohr objectified a quantum system through the specification of a classical experimental
context, i.e. by looking at it through appropriate classical glasses.
• Einstein objectified any physical system by claiming its independent existence:


‘The belief in an external world independent of the perceiving subject is the basis of all 
natural science.’ (Einstein, 1954, p. 266). 


On a suitable mathematical interpretation, these conditions for the objectification 
of the system turn out to be equivalent! Namely, identifying Bohr’s apparatus with 
Einstein’s perceiving subject, calling its algebra of observables A, and denoting the 
algebra of observables of the quantum system to be objectified by B, our reading of 
the doctrine of classical concepts (to be explained in more detail below) is simply 
that A be commutative. Einstein, on the other hand, insists that the system under 
observation has its own state, so that there must be no entangled states on the tensor 
product $A \otimes B$ that describes the composite system. Equivalently, every pure state on
$A \otimes B$ must be a product state, so that both A and B have states that together determine
the joint state of $A \otimes B$. This is the case if and only if A or B is commutative,
and since B is taken to be a quantum system, it must be A (see the notes to §6.5 for
details). Thus Bohr’s objectification criterion turns out to coincide with Einstein’s! 
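
One direction of this equivalence can be checked by hand in the smallest example. The sketch below rests on assumptions that are not in the text: B is taken to be the 2×2 matrices, A is either the diagonal 2×2 matrices (commutative) or all 2×2 matrices, and the state is the Bell vector in C² ⊗ C². The elements `a`, `a_nc` and `b` are arbitrary illustrative choices. On diagonal a, the Bell state gives the same expectation values as an even mixture of two product states, so it is not pure on the algebra with commutative A. The entanglement only shows up through off-diagonal elements of A.

```python
import math

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def expectation(psi, M):
    """<psi, M psi> for a real unit vector psi."""
    return sum(psi[i] * M[i][j] * psi[j]
               for i in range(len(psi)) for j in range(len(psi)))

def mixture(a, b):
    """Even mixture of the product states (point i on A) x (vector e_i on B)."""
    return 0.5 * a[0][0] * b[0][0] + 0.5 * a[1][1] * b[1][1]

# Bell vector (e0 x e0 + e1 x e1)/sqrt(2), components ordered 00, 01, 10, 11.
s = 1 / math.sqrt(2)
bell = [s, 0.0, 0.0, s]

a = [[2.0, 0.0], [0.0, -1.0]]      # element of the commutative (diagonal) A
a_nc = [[0.0, 1.0], [1.0, 0.0]]    # off-diagonal element, only in noncommutative A
b = [[0.5, 1.5], [1.5, 3.0]]       # arbitrary self-adjoint element of B

lhs = expectation(bell, kron(a, b))        # Bell state on a (x) b
rhs = mixture(a, b)                        # mixture of product states
lhs_nc = expectation(bell, kron(a_nc, b))  # picks up the coherence terms
rhs_nc = mixture(a_nc, b)

print(lhs, rhs, lhs_nc, rhs_nc)
```

For diagonal a the two numbers agree, whereas for the off-diagonal element they differ, which is where the restriction to a commutative A removes the entanglement.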

Thirdly, the doctrine of classical concepts describes all known applications to 
date of quantum theory to experimental physics; and therefore we simply have to 
use it if we are interested in understanding these applications. This is true for the 
entire range of empirically accessible energy and length scales, from molecular and 
condensed matter physics (including quantum computation) to high-energy physics 
(in colliders as well as in the context of astro-particle physics). So if people working 
in a field like quantum cosmology complain about the Copenhagen Interpretation 
then perhaps they should ask themselves if their field is more than a chimera. 

Given its clear empirical relevance, it is a moot point whether the doctrine of 
classical concepts is as necessary as Bohr and Heisenberg claimed it was:


‘In their attempts to formulate the general content of quantum mechanics, the representa-
tives of the Copenhagen School often used formulations with which they do not merely
say how things are in their opinion, but beyond that, they say that things must be thus and 
so (...) They chose formulations for the mere communication of an item in which at the 
same time the inevitability of what is communicated is asserted. (...) The assertion of the 
necessity of a proposition adds nothing to its content.’ (Scheibe, 2001, pp. 402-403) 


The doctrine of classical concepts implies in particular that the measuring appa- 
ratus is to be described classically; indeed, along with its coupling to the system 
undergoing measurement, it is its classical description which turns some device— 
which a priori is a quantum system like anything else—into a measuring apparatus. 
This point was repeated over and over by Bohr and Heisenberg, but in our view the 
clearest explanation of this crucial point has been given by Scheibe: 


‘It is necessary to avoid any misunderstanding of the buffer postulate [i.e., the doctrine 
of classical concepts], and in particular to emphasize that the requirement of a classical 
description of the apparatus is not designed to set up a special class of objects differing 
fundamentally from those which occur in a quantum phenomenon as the things examined 
rather than measuring apparatus. This requirement is essentially epistemological, and af- 
fects this object only in its role as apparatus. A physical object which may act as apparatus 
may in principle also be the thing examined. (...) The apparatus is governed by classical 
physics, the object by the quantum-mechanical formalism.’ (Scheibe, 1973, pp. 24-25)


Thus it is essential to the Copenhagen Interpretation that one can describe at least 
some quantum-mechanical devices classically: those for which this is possible in- 
clude the candidate-apparatuses (i.e. measuring devices). In view of its importance 
for their interpretation of quantum mechanics, it is remarkable how little Bohr, 
Heisenberg, and their followers did to seriously address this problem of a dual de- 
scription of at least part of the world, although they were clearly aware of this need: 


‘In the system to which the quantum mechanical formalism is to be applied, it is of course 
possible to include any intermediate auxiliary agency employed in the measuring process. 
Since, however, all those properties of such agencies which, according to the aim of mea- 
surements have to be compared with the corresponding properties of the object, must be 
described on classical lines, their quantum mechanical treatment will for this purpose be 
essentially equivalent with a classical description.’ (Bohr, 1939, pp. 23-24; quotation taken 
from Camilleri & Schlosshauer, 2015, p. 79) 


In defense of this alleged equivalence, we read almost circular explanations like: 


‘the necessity of basing the description of the properties and manipulation of the measur- 
ing instruments on purely classical ideas implies the neglect of all quantum effects in that 
description.’ (Bohr, 1939, p. 19) 


Since it delineates an appropriate regime, the following is slightly more informative: 


‘Incidentally, it may be remarked that the construction and the functioning of all apparatus 
like diaphragms and shutters, serving to define geometry and timing of the experimental 
arrangements, or photographic plates used for recording the localization of atomic objects, 
will depend on properties of materials which are themselves essentially determined by the 
quantum of action. Still, this circumstance is irrelevant for the study of simple atomic phe- 
nomena where, in the specification of the experimental conditions, we may to a very high 
degree of approximation disregard the molecular constitution of the measuring instruments.
If only the instruments are sufficiently heavy compared with the atomic objects under inves-
tigation, we can in particular neglect the requirement of the [uncertainty] relation as regards
the control of the localization in space and time of the single pieces of the apparatus relative
to each other.’ (Bohr, 1948, pp. 315-316).


Even Heisenberg restricted himself to very general comments like: 


‘This follows mathematically from the fact that the laws of quantum theory are for the 
phenomena in which Planck’s constant can be considered as a very small quantity, approx-
imately identical with the classical laws.’ (Heisenberg, 1958, p. 57).


Notwithstanding these vague or even circular explanations, the connection between 
classical and quantum mechanics was at the forefront of research in the early days 
of quantum theory, and even predated quantum mechanics. For example, Jammer 
(1966, p. 109) notes that already in 1906 Planck suggested that 


‘the classical theory can simply be characterized by the fact that the quantum of action 
becomes infinitesimally small.’ 


In fact, in the same context as Planck, namely his radiation formula, Einstein made 
a similar point already in 1905. Subsequently, Bohr’s Correspondence Principle, 
which originated in the context of atomic radiation, suggested an asymptotic re- 
lationship between quantum mechanics and classical electrodynamics. As such, it 
played a major role in the creation of quantum mechanics (Bohr, 1976; Jammer,
1966; Mehra & Rechenberg, 1982; Hendry, 1984; Darrigol, 1992), but the contem-
porary (and historically inaccurate) interpretation of the Correspondence Principle
as the idea that all of classical physics should be a certain limiting case of quantum 
physics seems of much later date (cf. Landsman, 2007a; Bokulich, 2008). 
Ironically, the possibility of giving a dual classical–quantum description of mea-
surement apparatuses, though obviously crucial for the consistency of the Copen-
hagen Interpretation, simply seems to have been taken for granted, whereas also the
more ambitious problem of explaining at least the appearance of the classical world 
(i.e. beyond measurement devices) from quantum theory—which is central to cur- 
rent research in the foundations of quantum mechanics—is not to be found in the 
writings of Bohr (who, after all, saw the explanation of experiments as his job). 
Perhaps Heisenberg could have used the excuse that he regarded the problem as 
solved by his 1927 paper on the uncertainty relations; but on both technical and con- 
ceptual grounds it would have been a feeble excuse. One of the few expressions of at 
least some dissatisfaction with the situation from within the Copenhagen school—if 
phrased ever so mildly—came from Bohr’s former research associate Landau: 


‘Thus quantum mechanics occupies a very unusual place among physical theories: it con- 
tains classical mechanics as a limiting case, yet at the same time it requires this limiting 
case for its own formulation.’ (Landau & Lifshitz, 1977, p. 3) 


In other words, the relationship between the (generalized) Correspondence Principle 
and the doctrine of classical concepts needs to be clarified, and such a clarification 
should hopefully also provide the key for the solution of the grander problem of 
deriving the classical world from quantum theory under appropriate conditions.

As a first step to this end, Bohr’s conceptual ideas should be interpreted within
the formalism of quantum mechanics before they can be applied to the physical 
world, an intermediate step Bohr himself seems to have considered superfluous: 


‘I noticed that mathematical clarity had in itself no virtue for Bohr. He feared that the 
formal mathematical structure would obscure the physical core of the problem, and in any 
case, he was convinced that a complete physical explanation should absolutely precede the 
mathematical formulation.’ (Heisenberg, 1967, p. 98) 


Fortunately, von Neumann did not return the compliment, since beyond its brilliant 
mathematical content, his Mathematische Grundlagen der Quantenmechanik from 
1932 devoted considerable attention to conceptual issues. For example, he gave the 
most general form of the Born rule (which is the central link between experimen- 
tal physics and the Hilbert space formalism), he introduced density operators for 
quantum statistical mechanics (which are still in use), he conceptualized projection 
operators as yes-no questions (paving the way for his later development of quantum 
logic with Birkhoff, as well as for Gleason’s Theorem and the like), in his analysis 
of hidden variables he introduced the mathematical concept of a state that became 
pivotal in operator algebras (including the algebraic approach to quantum mechan- 
ics), en passant also preparing the ground for the theorems of Bell and Kochen & 
Specker (which exclude hidden variables under physically more relevant assump- 
tions than von Neumann’s), and, last but not least, his final chapter on the measure- 
ment problem formed the basis for all serious subsequent literature on this topic. 
Nonetheless, much as Bohr’s philosophy of quantum mechanics would benefit 
from a precise mathematical interpretation, von Neumann’s mathematics would be 
more effective in physics if it were supplemented by sound conceptual moves (be- 
yond the ones he provided himself). Killing two birds with one stone, we implement 
the doctrine of classical concepts in the language of operator algebras, as follows: 


The physically relevant aspects of the noncommutative operator algebras of quantum- 
mechanical observables are only accessible through commutative algebras. 


Our Bohrification program, then, splits into two parts, which are distinguished by 
the precise relationship between a given noncommutative operator algebra A (rep- 
resenting the observables of some quantum system, as detailed below) and the com- 
mutative operator algebras (i.e. classical contexts) that give physical access to A. 

While delineated mathematically, these two branches also reflect an unresolved 
conceptual disagreement between Bohr and Heisenberg about the status of clas- 
sical concepts (Camilleri, 2009b). According to Bohr—haunted by his idea of 
Complementarity—only one classical concept (or one coherent family of classi- 
cal concepts) applies to the experimental study of some quantum object at a time. 
If it applies, it does so exactly, and has the same meaning as in classical physics; 
in Bohr’s view, any other meaning would be undefined. In a different experimental 
setup, some other classical concept may apply. Examples of such “complementary” 
pairs are particle versus wave (an example Bohr stopped using after a while), space- 
time description versus “causal description” (by which Bohr means conservation 
laws), and, in his later years, one “phenomenon” (i.e., an indivisible unit of a quan-
tum object plus an experimental arrangement) against another. For example:


‘My main purpose (...) is to emphasize that in the phenomena concerned we are (...) deal-
ing with a rational discrimination between essentially different experimental arrangements
and procedures which are suited either for an unambiguous use of the idea of space loca- 
tion, or for a legitimate application of the conservation theorem of momentum (...) which 
therefore in this sense may be considered as complementary to each other (...) Indeed we 
have in each experimental arrangement suited for the study of proper quantum phenomena 
not merely to do with an ignorance of the value of certain physical quantities, but with the 
impossibility of defining these quantities in an unambiguous way.’ (Bohr, 1935, p. 699).


Heisenberg, on the other hand, seems to have held a more relaxed attitude towards 
classical concepts, perhaps inspired by his famous 1925 paper on the quantum- 
mechanical reinterpretation (Umdeutung) of mechanical and kinematical relations, 
followed by his equally great paper from 1927 already mentioned. In the former, 
he introduced what we now call quantization, in putting the observables of classical 
physics (i.e. functions on phase space) on a new mathematical footing by turning 
them into what we now call operators (initially in the form of infinite matrices), 
where they also have new properties. In the latter, Heisenberg tried to find some op- 
erational meaning of these operators through measurement procedures. Since quan- 
tization applies to all classical observables at once, all classical concepts apply si- 
multaneously, but approximately (ironically, like most research on quantum theory 
at the time, the 1925 paper was inspired by Bohr’s Correspondence Principle). 

To some extent, then, Bohr’s view on classical concepts comes back mathemati- 
cally in exact Bohrification, which studies (unital) commutative C*-subalgebras C 
of a given (unital) noncommutative C*-algebra A, whereas Heisenberg’s interpreta- 
tion of the doctrine resurfaces in asymptotic Bohrification, which involves asymp- 
totic inclusions (more specifically, deformations) of commutative C*-algebras into 
noncommutative ones. So the latter might have been called Heisenbergification in- 
stead, but in view of both the ugliness of this word and the historical role played by 
Bohr’s Correspondence Principle just alluded to, the given name has stuck. 

The precise relationship between Bohr’s and Heisenberg’s views, and hence also 
between exact and asymptotic Bohrification, remains to be clarified; their joint ex- 
istence is unproblematic, however, since the two programs complement each other. 


• Exact Bohrification turns out to be an appropriate framework for:


– The Born rule (for single case probabilities).

– Gleason’s Theorem (which justifies von Neumann’s notion of a state as a pos-
itive linear expectation value, assuming the operator part of quantum theory).

– The Kochen–Specker Theorem (excluding non-contextual hidden variables).

– The Kadison–Singer Conjecture (concerning uniqueness of extensions of pure
states from maximal commutative C*-subalgebras of the algebra B(H) of all
bounded operators on a separable Hilbert space H to B(H)).

– Wigner’s Theorem (on unitary implementation of symmetries of pure states
with transition probabilities, and its analogues for other quantum structures).

– Quantum logic (which, if one adheres to the doctrine of classical concepts,
turns out to be intuitionistic and hence distributive, rather than orthomodular).

– The topos-theoretic approach to quantum mechanics (which from our point
of view encompasses quantum logic and implies the preceding claim).


• Asymptotic Bohrification, on the other hand, provides a mathematical setting for:


– The classical limit of quantum mechanics.

– The Born rule (for probabilities measured as long-run frequencies).

– The infinite-volume limit of quantum statistical mechanics.

– Spontaneous symmetry breaking (SSB).

– The Measurement Problem (highlighted by Schrödinger’s Cat).


On the philosophical side, the limiting procedures inherent in asymptotic Bohrifi- 
cation may be seen in the light of the (alleged) phenomenon of emergence. From 
the philosophical literature, we have distilled two guiding thoughts which, in our 
opinion, should control the use of limits, idealizations, and emergence in physics 
and hence play a paramount role in this book. The first is Earman’s Principle: 


‘While idealizations are useful and, perhaps, even essential to progress in physics, a sound 
principle of interpretation would seem to be that no effect can be counted as a genuine 
physical effect if it disappears when the idealizations are removed.’ (Earman, 2004, p. 191) 


The second is Butterfield’s Principle, which in a sense is a corollary to Earman’s 
Principle, and should be read in the light of Butterfield’s own definition of emer- 
gence as ‘behaviour that is novel and robust relative to some comparison class’, 
which among other virtues removes the reduction-emergence opposition: 


“there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to 
the limit, i.e. for finite V. And it is this weaker behaviour which is physically real.” 
(Butterfield, 2011, p. 1065) 


Indeed, the link between theory and reality stands or falls with an adherence to these 
principles, for real materials (like a ferromagnet or a cat) are described by the quan-
tum theory of finite systems (i.e., ħ > 0 or N < ∞, as opposed to their idealized
limiting cases ħ = 0 or N = ∞), and yet they do display the remarkable phenom-
ena that strictly speaking are only possible in the corresponding limit theories, like
symmetry breaking, or the fact that cats are either dead or alive, as a metaphor for 
the fact that measurements have outcomes. This simple observation shows that any 
physically relevant conclusion drawn from some idealization must be foreshadowed 
in the underlying theory already for positive values of ħ or finite values of N.

Despite their obvious validity, it is remarkable how often idealizations violate 
these principles. For example, all rigorous theories of spontaneous symmetry break- 
ing in quantum statistical mechanics (Bratteli & Robinson, 1981) and in quantum 
field theory (Haag, 1992) strictly apply to infinite systems only, since ground states 
of finite quantum systems are typically unique (and hence symmetric), whilst ther- 
mal equilibrium states of such systems are even always unique (see also Chapter 
10). As explained in Chapter 11, the “Swiss” approach to the measurement problem 
based on superselection rules faces a similar problem, and must be discarded for that 
reason. Bohr’s doctrine of classical concepts is particularly vulnerable to Earman’s 
Principle, since classical physics (in whose language we are supposed to express the 
account of all evidence) is not realized in nature but only in the human mind, so to 
speak. This necessitates great care in implementing this doctrine.

Interestingly, in his famous lecture “Über das Unendliche”, in which he ex-
pounded his finitary program intended to save mathematics against the devilish in-
tuitionist challenge of L.E.J. Brouwer, Hilbert (1925) expressed similar principles
controlling the use of infinite idealizations in mathematics: 


“Und so wie bei den Grenzprozessen der Infinitesimalrechnung das Unendliche im Sinne
des Unendlichkleinen und des Unendlichgroßen sich als eine bloße Redensart erweisen ließ,
so müssen wir auch das Unendliche im Sinne der Unendlichen Gesamtheit, wo wir es jetzt
noch in den Schlußweisen vorfinden, als etwas bloß scheinbaren erkennen. Und so wie das
Operieren mit dem Unendlichkleinen durch Prozesse im Endlichen ersetzt wurde, welche
ganz dasselbe leisten und zu ganz denselben eleganten formalen Beziehungen führen, so
müssen überhaupt die Schlußweisen mit dem Unendlichen durch endliche Prozesse ersetzt
werden, die gerade dasselbe leisten, d.h. dieselben Beweisgänge und dieselben Methoden
der Gewinnung von Formeln und Sätzen ermöglichen.” (Hilbert, 1925, p. 162).⁶


In addition, asymptotic Bohrification has three rather more technical roots: 


1. A new approach to quantization theory developed in the 1970s under the name 
of deformation quantization (Berezin, 1975; Bayen et al., 1978), where the non-
commutative algebras characteristic of quantum mechanics arise as deforma-
tions of Poisson algebras. In Rieffel’s (1989, 1994) approach to deformation
quantization, further developed in Landsman (1998a), the deformed algebras are 
C*-algebras, and hence the apparatus of operator algebras and noncommutative 
geometry (Connes, 1994) becomes available. Deformation quantization gives a 
mathematically precise and physically relevant meaning to the limit ħ → 0, and
shows that quantization and the classical limit are two sides of the same coin. 

2. The mathematical analysis of the BCS-model of superconductivity initiated by 
Bogoliubov (1958) and Haag (1962), which, in the more general setting of mean- 
field models of solid state physics, culminated in the work of Bona (1988, 2000), 
Raggio & Werner (1989), and Duffield & Werner (1992). These authors showed 
that in the macroscopic limit N → ∞, non-commutative algebras of quantum-
mechanical observables (which are typically tensor powers of matrix algebras
Mₙ(ℂ)) converge to some commutative algebra (typically consisting of all con-
tinuous functions on the state space of Mₙ(ℂ)), at least for macroscopic averages.

3. The role of low-lying states and the ensuing instability of ground states under tiny 
perturbations in the two limits at hand, discovered by Jona-Lasinio, Martinelli, & 
Scoppola (1981) for the classical limit ħ → 0, and by Koma & Tasaki (1994) for
the macroscopic limit N → ∞. In combination with the previous items, this led to
a new approach to the measurement problem (Landsman & Reuvers, 2013) and 
to spontaneous symmetry breaking and emergence (Landsman, 2013), which in 
particular addresses these issues in the framework of asymptotic Bohrification. 
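The macroscopic limit in item 2 can be made tangible with a small computation. The following is a minimal sketch under assumptions chosen here for illustration: the single-site algebra is M₂(ℂ), the observables are the Pauli matrices, and the macroscopic averages are S_x = (1/N) Σᵢ σ_x⁽ⁱ⁾ and S_y = (1/N) Σᵢ σ_y⁽ⁱ⁾ on the N-fold tensor power. All function names are hypothetical. Since [S_x, S_y] = (2i/N) S_z, the norm of the commutator is 2/N, so the averages commute asymptotically, in line with the convergence to a commutative algebra described above.

```python
def kron(a, b):
    """Kronecker product of square matrices given as nested lists."""
    na, nb = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
            for i in range(na) for k in range(nb)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def average(s, n):
    """Macroscopic average (1/n) * sum_i s^(i) on the n-fold tensor power of M_2(C)."""
    dim = 2 ** n
    total = [[0j] * dim for _ in range(dim)]
    for site in range(n):
        term = [[1]]
        for k in range(n):
            # s acts on the chosen site, the identity on all other sites
            term = kron(term, s if k == site else I2)
        for i in range(dim):
            for j in range(dim):
                total[i][j] += term[i][j] / n
    return total

def commutator_norm(n):
    """Operator norm of [S_x, S_y] for averages over n sites.

    The commutator equals (2i/n) S_z, which is diagonal in the standard basis,
    so its operator norm is the largest absolute value on the diagonal.
    """
    x, y = average(SX, n), average(SY, n)
    xy, yx = matmul(x, y), matmul(y, x)
    dim = len(x)
    comm = [[xy[i][j] - yx[i][j] for j in range(dim)] for i in range(dim)]
    # sanity check: off-diagonal entries vanish up to rounding
    assert all(abs(comm[i][j]) < 1e-12
               for i in range(dim) for j in range(dim) if i != j)
    return max(abs(comm[i][i]) for i in range(dim))

for n in (1, 2, 3, 4):
    print(n, commutator_norm(n))
```

The example only covers one pair of averaged observables for small N; the cited results concern the full algebraic limit.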


⁶ ‘Just as in the limit processes of the infinitesimal calculus, the infinite in the sense of the infinitely
large and the infinitely small proved to be merely a figure of speech, so too we must realize that 
the infinite in the sense of an infinite totality, where we still find it in deductive methods, is an 
illusion. Just as operations with the infinitely small were replaced by operations with the finite 
which yielded exactly the same results and led to exactly the same elegant formal relationships, 
so in general must deductive methods based on the infinite be replaced by finite procedures which 
yield exactly the same results, i.e., which make possible the same chains of proofs and the same 
methods of getting formulas and theorems.’ (Benacerraf & Putnam, 1983, p. 184).


This book is organized into two parts. Rather than following the partition of
our approach into exact and asymptotic Bohrification, these parts reflect the (math- 
ematical) sophistication of the material, starting with finite sets, and ending with 
a combination of C*-algebras and topos theory. Part I, called C₀(X) and B(H),
gives a mathematical introduction to both classical and quantum mechanics from 
an operator-algebraic point of view, in which these theories are kept separate, whilst 
mathematical analogies are stressed whenever possible. This part emphasizes the 
notion of symmetry, and includes some of the main abstract mathematical results 
about quantum mechanics (i.e., those not involving the study of Schrödinger op-
erators and concrete models), such as the Born rule, the theorems of Gleason and
Kochen & Specker already mentioned, the one of Wigner (on symmetries) and its 
numerous derivatives, including a new one on unitary implementability of symme-
tries of the poset 𝒞(B(H)) of unital commutative C*-subalgebras of B(H), and
Stone’s Theorem on unitary implementability of time evolution in quantum me- 
chanics. This part may also serve as a reference for such fundamental theorems 
about quantum mechanics. An unusual ingredient of this part is our discussion of 
the Kadison–Singer Conjecture, included because of its fit into (exact) Bohrification.
Also elsewhere, results are (re)phrased in a language appropriate to this ideology. 

Experts in the C*-algebraic approach to quantum mechanics will be able to read 
the second part independently of the first (which they might therefore skip if they 
find it to be too elementary), but the spirit of Bohrification will only be instilled in 
the reader if (s)he reads the entire book; indeed, it is this very spirit that keeps the 
two parts together and turns the book into a whole. Part II, entitled Between C₀(X)
and B(H), starts with a survey of some known results on the grey area between clas- 
sical and quantum, such as Bell’s Theorem(s) and the so-called Free Will Theorem. 
It then embarks on the asymptotic Bohrification program, including (deformation) 
quantization and the classical limit (including a small excursion into indistinguish- 
able particles), large systems and their (thermodynamic) limit, and the Born rule 
(revisited). This part centers on a somewhat idiosyncratic treatment of spontaneous 
symmetry breaking (SSB) and the closely related measurement problem of quan- 
tum mechanics, which is given an unusual but technically precise formulation in the 
spirit of the Copenhagen Interpretation, and hence is meant to be relevant to actual 
experimental physics (which is what the Copenhagen Interpretation covers). 

Our treatment of both quantization and SSB relies mathematically on continu- 
ous bundles of C*-algebras, while the principles of Earman and Butterfield provide 
philosophical guidance. This is also true for our approach to the measurement prob- 
lem, which combines elements of quantization and SSB. Although experiments and 
detailed theoretical models are lacking so far, this powerful combination of mathe- 
matical and philosophical tools leads to a compelling scenario for solving the mea- 
surement problem, harboring the hope of finally laying this problem to rest. Like 
dynamical collapse models that require modifications of quantum mechanics, our 
scenario looks at the wave-function realistically, and hence describes measurement 
as a physical process, including the collapse that settles the outcome (as opposed to 
reinterpretations of the uncollapsed state, as in modal or Everettian interpretations). 
However, in our approach collapse takes place within unitary quantum theory.

Insolubility theorems for the measurement problem are circumvented, because
these rely on the counterfactual that if ψₙ were the initial state, then for each n it
would evolve (linearly) according to the Schrödinger equation with given Hamilto-
nian h, whereas if the initial state were ∑ₙ cₙψₙ, also then it would evolve accord-
ing to the same Hamiltonian h. However, Butterfield’s Principle implies that this
counterfactual is inapplicable precisely in the measurement situations it is meant 
for, because the dual description of the apparatus as both classical and quantum- 
mechanical causes extreme sensitivity of the wave-function to even the tiniest per- 
turbations of the Hamiltonian. Indeed, such perturbations dynamically enforce some 
particular outcome of the measurement. Our scenario also rejects the typical way of 
looking at measurement as a two-step process (going back to von Neumann himself 
and widely adopted in the literature ever since), i.e., of firstly a transition of a pure 
state to a mixed one (this is his ill-fated “process 1”), followed by the registration of
a single outcome. In real measurements (like elsewhere), pure states remain pure! If
our scenario is correct, the mistaken impression that quantum theory implies the
irreducible randomness of nature arises because measurement outcomes are merely
unpredictable “for all practical purposes”; indeed, they are unpredictable
in a way that dwarfs even the apparent randomness of classical chaotic systems.

The final chapter on topos theory and quantum logic elaborates on ideas originat-
ing with Isham and Butterfield. It centers on the poset 𝒞(A) of all unital commuta-
tive C*-subalgebras of a unital C*-algebra A, ordered by inclusion; with some good-
will, one might call 𝒞(A) the mathematical home of Complementarity (although the
construction applies even when A itself is commutative). The power of this poset is
already clear in Part I, where the special case A = B(H) leads to a new version of
Wigner’s Theorem on unitary implementability of symmetries. Hamhalter’s Theorem,
which is a far-reaching generalization of this version, then shows that 𝒞(A) carries
at least as much information about A as the pure state space. Furthermore, 𝒞(A)
enforces a (new) notion of quantum logic that turns out to be intuitionistic in being
distributive but denying the law of the excluded middle (on which both classical
logic and the non-distributive quantum logic of Birkhoff–von Neumann are based).
Finally, 𝒞(A) gives rise to a quantum phase space (which is lacking in the usual
formalism), on which observables are functions and states are probability measures,
just like in classical physics (but now “internal” to a particular topos, i.e., a mathe-
matical universe alternative to set theory, in which logic is typically intuitionistic).

About a third of the book is devoted to mathematical appendices. Those on func- 
tional analysis and operator algebras give thorough introductions to these subjects, 
sparing the reader the effort to study books like Bratteli & Robinson (1981), Con-
way (2007), Dudley (1989), Kadison & Ringrose (1983, 1986), Lance (1995), Ped-
ersen (1989), Reed & Simon (1972), Schmüdgen (2012), and Takesaki (2002, 2003).
The appendices on logic, category theory, and topos theory, on the other hand, are 
far from exhaustive (though self-contained): they provide a shortcut to the neces- 
sary parts of e.g. Johnstone (1987), Mac Lane (1998), and Mac Lane & Moerdijk 
(1992), or, alternatively, of Bell & Machover (1977) and Bell (1988). Though pri- 
marily meant to support the main body of the book, these appendices may also be of 
some interest by themselves, especially to philosophers, but even to mathematicians.

As a “Quick Start Guide” for readers in a hurry, we now summarize the main
definitions in the theory of operator algebras. A C*-algebra is an associative algebra A
(over ℂ) equipped with an involution (i.e., a real-linear map a ↦ a* such that

\[ a^{**} = a, \quad (ab)^* = b^* a^*, \quad (\lambda a)^* = \bar{\lambda} a^*, \]

for all a, b ∈ A and λ ∈ ℂ), as well as a norm in which A is complete (i.e., a Banach
space), such that algebra, involution, and norm are related by the axioms

\[ \|ab\| \le \|a\|\,\|b\|; \]
\[ \|a^* a\| = \|a\|^2. \]
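The two norm axioms can be checked numerically in the simplest non-commutative case. The sketch below assumes A = M₂(ℂ) with the operator norm, computed as the square root of the largest eigenvalue of a*a; the sample matrices and the helper names (`adjoint`, `matmul`, `op_norm`) are illustrative choices, not notation from the text.

```python
import math

def adjoint(a):
    """Conjugate transpose a* of a 2x2 complex matrix given as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(a):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of a*a."""
    h = matmul(adjoint(a), a)            # positive Hermitian 2x2 matrix
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)   # clamp tiny negative rounding errors
    return math.sqrt((tr + math.sqrt(disc)) / 2)

a = [[1, 2j], [0, 1 - 1j]]
b = [[0, 1], [1j, 2]]

# ||ab|| <= ||a|| ||b||
submultiplicative = op_norm(matmul(a, b)) <= op_norm(a) * op_norm(b) + 1e-12
# ||a*a|| = ||a||^2
c_star_identity = abs(op_norm(matmul(adjoint(a), a)) - op_norm(a) ** 2) < 1e-9
print(submultiplicative, c_star_identity)
```

A check on two sample matrices is of course no substitute for the general proof; it merely shows what the axioms assert in a concrete case.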


The two main classes of C*-algebras are: 


• The space C₀(X) of all continuous functions f : X → ℂ that vanish at infinity (i.e.,
for any ε > 0 the set {x ∈ X | |f(x)| ≥ ε} is compact), where X is some locally
compact Hausdorff space, with pointwise addition and multiplication, involution

\[ f^*(x) = \overline{f(x)}, \]

and a norm

\[ \|f\|_\infty = \sup_{x \in X} \{ |f(x)| \}. \]

It is of fundamental importance for physics and mathematics that C₀(X) is com-
mutative. Conversely, Gelfand & Naimark (1943) proved that every commutative
C*-algebra A is isomorphic to C₀(X) for some locally compact Hausdorff space X,
which is determined by A up to homeomorphism (X is called the Gelfand spec-
trum of A). Note that C₀(X) has a unit (i.e. the function 1_X that is equal to 1 for
any x) iff X is compact.

• Norm-closed subalgebras A of the space B(H) of all bounded operators on some
Hilbert space H for which a* ∈ A iff a ∈ A; this includes the case A = B(H). Here
one uses the standard operator norm

\[ \|a\| = \sup\{ \|a\psi\|,\ \psi \in H,\ \|\psi\| = 1 \}, \]

the algebraic operations are the natural ones, and the involution is the adjoint.
If dim(H) > 1, B(H) is a non-commutative C*-algebra. An important special
case is the C*-algebra B₀(H) of all compact operators on H, which has no unit
whenever H is infinite-dimensional (whereas B(H) is always unital). In their
fundamental paper, Gelfand & Naimark (1943) also proved that every C*-algebra
is isomorphic to some A ⊂ B(H) for some Hilbert space H.


These classes are related as follows: in the commutative case $A = C_0(X)$, take

$$H = L^2(X, \mu),$$

where the support of the measure $\mu$ is $X$, on which $C_0(X)$ acts by multiplication
operators, that is, $m_f \psi = f\psi$, where $f \in C_0(X)$ and $\psi \in L^2(X, \mu)$.
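
As a minimal sketch (not taken from the text), the axioms above can be checked numerically in the simplest commutative case, where $X$ is a finite set and $C_0(X) = C(X)$ consists of all complex-valued functions on $X$. The three-point set and the helper names `mul`, `star` and `sup_norm` are illustrative assumptions.

```python
# Sketch: C(X) for a finite set X, with functions stored as dicts x -> complex.
X = ["a", "b", "c"]  # hypothetical three-point space

def mul(f, g):
    """Pointwise product (f g)(x) = f(x) g(x)."""
    return {x: f[x] * g[x] for x in X}

def star(f):
    """Involution f*(x) = complex conjugate of f(x)."""
    return {x: f[x].conjugate() for x in X}

def sup_norm(f):
    """Supremum norm; a maximum here, since X is finite."""
    return max(abs(f[x]) for x in X)

f = {"a": 3 + 4j, "b": 1j, "c": 0j}
g = {"a": 2 + 0j, "b": -1 + 0j, "c": 5j}

# The two norm axioms and commutativity, checked on this pair of functions.
submultiplicative = sup_norm(mul(f, g)) <= sup_norm(f) * sup_norm(g)
c_star_identity = abs(sup_norm(mul(star(f), f)) - sup_norm(f) ** 2) < 1e-12
commutative = mul(f, g) == mul(g, f)
```

Checking one pair of functions proves nothing, of course; the point is only to make the pointwise operations and the supremum norm concrete.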
As already noted, C*-algebras were introduced by Gelfand & Naimark (1943),
generalizing the rings of operators studied by von Neumann during 1930-1949,
partly in collaboration with Murray (von Neumann, 1930, 1931, 1938, 1940, 1949;
Murray & von Neumann, 1936, 1937, 1943). These rings are now called von Neumann
algebras, and arise as the special case where a C*-algebra $A \subset B(H)$ satisfies

$$A'' = A,$$

in which for any subset $S \subset B(H)$ the commutant of $S$ is defined by

$$S' = \{a \in B(H) \mid ab = ba \ \forall b \in S\},$$

in terms of which the bicommutant of $S$ is given by $S'' = (S')'$. Equivalently, a C*-algebra
is a von Neumann algebra $M$ iff it is the dual of some Banach space $M_*$
(which is unique, and contains the so-called normal states on $M$).

Generalizing von Neumann’s concept of a state on $B(H)$, a state on a C*-algebra
$A$ (as first defined by Segal in 1947) is a linear map

$$\omega : A \to \mathbb{C}$$

that is positive in that

$$\omega(a^* a) \geq 0$$

for each $a \in A$, and normalized in that, noting that positivity implies boundedness,

$$\|\omega\| = 1,$$

where $\| \cdot \|$ is the usual norm on the Banach dual $A^*$. If $A$ has a unit $1_A$, then in the
presence of positivity, the above normalization condition is equivalent to

$$\omega(1_A) = 1.$$

The Riesz–Radon representation theorem in measure theory gives a bijective correspondence
between states $\omega$ on $A = C_0(X)$ and probability measures $\mu$ on $X$, viz.

$$\omega(f) = \int_X d\mu \, f,$$

for any $f \in C_0(X)$. At the other end of the operator-algebraic world, if $A = B(H)$,
then any density operator $\rho$ on $H$ gives a state $\omega$ on $B(H)$ by

$$\omega(a) = \mathrm{Tr}(\rho a),$$


but if H is infinite-dimensional there are other states, which cannot be normal. Such 
“singular” states are the C*-algebraic analogues of improper eigenstates for eigen- 
values in the continuous spectrum of some self-adjoint operator (think of position or 
momentum), and hence they make perfect sense physically. Singular states play an 
important role also mathematically, especially in the Kadison–Singer Conjecture.
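
For a finite-dimensional illustration of the density-operator states just described, the following sketch takes $H = \mathbb{C}^2$ and evaluates $\omega(a) = \mathrm{Tr}(\rho a)$ on $2 \times 2$ matrices in plain Python. The particular $\rho$ and $a$, and the helper names, are assumptions made for the example; singular states do not arise here, since $H$ is finite-dimensional.

```python
# Sketch: the state a -> Tr(rho a) on 2x2 complex matrices (H = C^2).
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose, i.e. the involution on B(H)."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

rho = [[0.75 + 0j, 0j], [0j, 0.25 + 0j]]   # a density operator: positive, trace 1
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

def omega(a):
    """omega(a) = Tr(rho a)."""
    return trace(matmul(rho, a))

a = [[1 + 2j, 3j], [0.5 + 0j, -1 + 1j]]    # an arbitrary test operator
value_on_unit = omega(identity)                    # normalization: equals 1
value_on_a_star_a = omega(matmul(adjoint(a), a))   # positivity: real and >= 0
```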
Let me close this Introduction with a small personal note on the way this book
came into being. Of the three disciplines relevant to the foundations of physics, 
namely mathematics, physics, and philosophy, my expertise has always been lo- 
cated within the first two, more specifically in mathematical physics. Nonetheless, 
my interest in the foundations of physics was triggered already at school, notably 
by books like The Dancing Wu-Li Masters by Gary Zukav, The Tao of Physics by 
Fritjof Capra (both of which may appear suspicious in hindsight), and especially 
by Werner Heisenberg’s fascinating (though historically unreliable) autobiography 
Physics and Beyond (called Der Teil und das Ganze in German). The second auto- 
biography that made a huge impression on me at the time was Bertrand Russell’s, 
which in particular made me want to go to Cambridge and become a so-called Apostle
(i.e. a member of an elitist secret conversation society that once included such
illustrious members as Moore, Keynes, Hardy, and Russell himself); the first dream 
was eventually realized (see below), about the second I have to remain silent. 

My interest in foundations was reinforced by two books on general relativity 
which I read as a first-year physics student, namely Raum - Zeit - Materie by Weyl
(1918) and The Mathematical Theory of Relativity by Eddington (1923). Although 
these were beyond my grasp at the time, they were clearly written in the spirit of 
Newton’s Principia, in that they were primarily treatises in natural philosophy, for 
which mathematical physics just provided the technical underpinning. Nonetheless, 
despite an unforgettable seminar by Jan Hilgevoord on the Heisenberg uncertainty 
relations in 1984, reporting on his recent joint work with Jos Uffink, foundations 
remained dormant during my undergraduate and PhD years (1981-1989). 

As a postdoc in Cambridge from 1989 onwards, I initially attended all seminars 
in any subject related to mathematics and/or physics I found remotely interesting, 
including the so-called Sigma Club, which at the time was organized by Michael 
Redhead. Michael was surrounded by a group of people I began to increasingly like, 
although I was and still am worried by their deification of John Bell (one speaker 
even asked his audience to stand whilst he was reading a passage from Speakable 
and Unspeakable in Quantum Mechanics). In any case, I was very kindly invited 
to speak at the Sigma Club on my recent paper on superselection rules and the 
measurement problem (whose approach I now eschew, since it violates Earman’s 
Principle, see above as well as Chapter 11 below), followed by a private dinner in 
the posh Riverside Restaurant with Michael (who asked my opinion about David 
Lewis, whom I unfortunately had never heard of). Indeed, the generosity of inviting 
an absolute beginner in the philosophy of physics to speak in such a prestigious 
seminar endeared me even further to both the subject and the community. 

My main business remained mathematical physics, but, reinforcing the earlier 
spark I had got from reading Weyl and Eddington (and later also from von Neumann
as well as Newton), two people (unfortunately no longer with us) made it clear to 
me that the goal of this discipline may include not only mathematics and physics, 
but also foundations, i.e., natural philosophy. These were Rob Clifton, who was a 
PhD student of Redhead and Butterfield, and Rudolf Haag, in whose group I had 
the honour to work during my year at Hamburg (1993-1994) as an Alexander von 
Humboldt Fellow (this was Haag’s last active year at the university, cf. Haag, 2010). 
My first book in 1998, which I wrote during my last two years at Cambridge,
when the prospect of having to leave Academia and hence the urge to leave a per- 
manent record loomed large, did not yet reflect this attitude. But my lengthy article 
on the classical-quantum interface in the Handbook of the Philosophy of Physics 
edited by Butterfield and Earman already did, and so does the present book. 

There is an inherent danger in a mathematical physics approach to foundations: 


‘I’m guided by the beauty of our weapons’ (Leonard Cohen) 


Our mathematical weapons, that is; this book is predicated on the idea that operator 
algebras provide the right language for quantum theory. If they don’t—for example, 
if path integrals are really its essence, as researchers especially in quantum gravity 
seem to believe, and there turns out to be a difference between the two toolkits—the 
mathematical underpinning of Bohrification would fall. Since our conceptual pro- 
gram is closely linked to this mathematical language, it would presumably collapse, 
too. Even if operator algebras stand, once some noncommutative alien gets direct 
access to the quantum world in defiance of Bohr’s doctrine of classical concepts, the 
conceptual framework behind Bohrification (and with it much of this book) would 
tremble. So far there has been no evidence for any of this, and as long as physics 
remains an empirical science I offer this book to the reader both as an introduction 
to modern mathematical methods in physics (in so far as these are relevant to foun- 
dational questions), and also as an alternative to various interpretations of quantum 
mechanics that seem to philosophize the physics of the problems away. 


Notes 


Each chapter is followed by a section called Notes, in which background and credits 
for the results in the given chapter are given. Such information is therefore absent in 
the main text (except when theorems, typically famous ones, are named after their
discoverers, like Gleason, Wigner, and the like). This Introduction, which anomalously
contains some references, is an exception, but we still provide some notes to it. 

Since this book is not an exegesis of Bohr but rather an exposition of some math- 
ematical ideas partly inspired by his work (with no claim to retroactive endorsement 
by Bohr or his followers), we hardly relied on the secondary literature on his phi- 
losophy, except, as already mentioned, on Scheibe (1973) and Beller (1999), both 
of which are pretty critical of Bohr. For a more balanced picture, one might consult 
monographs like Folse (1985), Murdoch (1987), McEvoy (2001), Brock (2003), the 
collection of essays edited by Faye & Folse (2017), as well as Dieks (2016a) and 
Zinkernagel (2016). Secondary literature on Heisenberg’s philosophy of physics is 
scarce, but includes Camilleri (2009b). Though irrelevant to the present book, one 
cannot resist mentioning Landsman (2002) on Heisenberg’s controversial political 
war record, from which he tried to escape by writing the intriguing essay Ordnung 
der Wirklichkeit, published 50 years later as Heisenberg (1994). 

A propos, notes on von Neumann and operator algebras follow §C.25. 
Strictly speaking, no previous knowledge of quantum mechanics is needed to understand
this book, but it is hard to imagine readers of this book without such a back-
ground. Beyond standard undergraduate physics courses, for mathematically seri- 
ous introductions to quantum mechanics—further to von Neumann (1932), which 
founded the subject—we recommend Bongaarts (2015), Gustafson & Sigal (2003), 
Hall (2013), Takhtajan (2008), and Thirring (2002). No previous acquaintance with 
the philosophy of quantum theory is required either, but once again it might be 
expected that typical readers of the present book have at least some awareness of 
this field. In fact, the author himself has only read a few such books from cover to 
cover, including Heisenberg (1958), Jammer (1966, 1974), Scheibe (1973), Earman 
(1986), van Fraassen (1991), Bub (1997), Beller (1999), and Wallace (2012). 

From these books, apart from its obvious source Heisenberg (1958), Bohrifi- 
cation (at least in its ‘exact’ variant) is conceptually akin to the program of Bub 
(1997), which was based on Clifton & Bub (1996); the past tense seems appropri- 
ate here, since Bub has meanwhile abandoned this program in favour of foundations 
based on information theory (Bub, 2004). Anyway, given some preferred observable 
$a \in B(H)_{\mathrm{sa}}$ and pure state $e \in \mathcal{P}_1(H)$ (i.e., a one-dimensional projection on $H$), the
Bub-Clifton approach looks for the largest C*-subalgebra $A$ of $B(H)$ on which one
may define something like a hidden variable compatible with the Born probabilities
emanating from the given state $e$ (the emphasis on some given $e$ comes from
the modal interpretation(s) of quantum mechanics). For generic states $e$ and observables
$a$, this typically allows $A$ to be noncommutative, which blasts the conceptual
framework of exact Bohrification. Requiring compatibility with quantum mechanics
for arbitrary states $e$, on the other hand, would force $A$ to be commutative. All this
relates to the Kochen–Specker Theorem; see the Notes to §6.1 for further details.

Finally, though remote from Wallace (2012) in our attempt to solve (or, in the 
light of the first quotation below, one should say “address”) the measurement problem
through physics rather than philosophy, even with this polar opposite author we
share the following attitude towards the foundations of quantum mechanics: 


‘The basic thesis of this book is that there is no quantum measurement problem (...) What 
I mean is that there is actually no conflict between the dynamics and ontology of (unitary) 
quantum theory and our empirical observations. (...) [I do not] wish to be read as offering 
yet one more “interpretation of quantum mechanics”. 


This book takes an extremely conservative approach to quantum mechanics (...) quantum 
mechanics can be taken literally (...) there is just unitary quantum mechanics. 


The way in which cats or tables exist is as structures within the underlying microphysics 
(...) [they are] emergent objects, higher-order entities.’ (Wallace, 2012, pp. 1, 2, 13, 38, 40) 


But although it may indeed apply to the town of Oxford, one might take issue with: 


‘It is simply false that there are alternative explanatory theories to Everett-interpreted quan- 
tum mechanics which can reproduce the predictions of quantum theory (...) The Everett 
interpretation is the only game in town.’ (Wallace, 2012, p. 43) 


Part I 
$C_0(X)$ and $B(H)$


Chapter 1 
Classical physics on a finite phase space 


Throughout this chapter, $X$ is a finite set, playing the role of the configuration space
of some physical system, or, equivalently (as we shall see), of its pure state space (in
the continuous case, $X$ will be the phase space rather than the configuration space).
One should not frown upon finite sets: for example, the configuration space of $N$
bits is given by $X = \underline{2}^{\underline{N}}$, where for arbitrary sets $Y$ and $Z$, the set $Y^Z$ consists of all
functions $x : Z \to Y$, and for any $N \in \mathbb{N}$ we write $\underline{N} = \{1, 2, \ldots, N\}$ (although, following
the computer scientists, $\underline{2}$ usually denotes $\{0, 1\}$). More generally, if one has
a lattice $\Lambda \subset \mathbb{Z}^d$ and each site is the home of some classical object (say a “spin”) that
may assume $N$ different configurations, then $X = \underline{N}^\Lambda$, in that $x : \Lambda \to \underline{N}$ describes
the configuration in which the “spin” at site $n \in \Lambda$ takes the value $x(n) \in \underline{N}$.
Although the setting is a priori deterministic, in that (knowing) some point $x \in X$
in its guise as a pure state at least in principle determines everything (there is
to say), the mathematical language will be probabilistic. Even within the confines
of classicality this allows one to do statistical physics, and as such it also sheds
light on e.g. the special status of $x$ as an extreme probability measure (see below).
Furthermore, the use of this language may be motivated by the goal of describing
classical and quantum mechanics as analogously as possible at this elementary level.
The following concepts play a central role in this chapter. Recall that the power
set $\mathcal{P}(X)$ of $X$ is the set of all subsets of $X$ (for finite $X$, these are all measurable).


Definition 1.1. 1. An event is a subset $U \subseteq X$, i.e., $U \in \mathcal{P}(X)$.

2. A probability distribution on $X$ is a function $p : X \to [0,1]$ such that $\sum_x p(x) = 1$.

3. A probability measure on $X$ is a function $P : \mathcal{P}(X) \to [0,1]$ such that $P(X) = 1$
and $P(U \cup V) = P(U) + P(V)$ whenever $U \cap V = \emptyset$.

4. For a given probability measure $P$ on $X$, and an event $V \subseteq X$ such that $P(V) > 0$,
the conditional probability $P(U|V)$ of $U$ given $V$ is defined by

$$P(U|V) = \frac{P(U \cap V)}{P(V)}. \tag{1.1}$$

5. A random variable on $X$ is a function $f : X \to \mathbb{R}$.
6. The spectrum of a random variable $f$ is the subset $\sigma(f) = \{f(x) \mid x \in X\}$ of $\mathbb{R}$.
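
The notions in Definition 1.1 can be made concrete with a small sketch. It assumes a hypothetical fair die as the finite set $X$, builds the probability measure by summing a distribution over events, and uses exact rational arithmetic; the names `P`, `conditional` and `spectrum` are illustrative.

```python
from fractions import Fraction

# Sketch: X is a hypothetical fair die; p is a probability distribution on X.
X = {1, 2, 3, 4, 5, 6}
p = {x: Fraction(1, 6) for x in X}

def P(U):
    """Probability measure of an event U (a subset of X), summing p over U."""
    return sum((p[x] for x in U), Fraction(0))

def conditional(U, V):
    """P(U|V) = P(U and V) / P(V), defined only when P(V) > 0."""
    if P(V) == 0:
        raise ValueError("conditioning on an event of probability zero")
    return P(U & V) / P(V)

def spectrum(f):
    """Spectrum of a random variable f: the set of values it takes on X."""
    return {f(x) for x in X}

even = {2, 4, 6}
at_least_four = {4, 5, 6}
parity = lambda x: x % 2          # a random variable with values in {0, 1}

p_even_given_high = conditional(even, at_least_four)   # Fraction(2, 3)
parity_spectrum = spectrum(parity)                     # {0, 1}
```

Using `Fraction` rather than floats keeps the normalization $P(X) = 1$ and the additivity on disjoint events exact.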
1.1 Basic constructions of probability theory


Probability distributions $p$ and probability measures $P$ determine each other by

$$P(U) = \sum_{x \in U} p(x); \tag{1.2}$$
$$p(x) = P(\{x\}), \tag{1.3}$$

but this is peculiar to finite sets (in general, probability measures will be primary).
Two special classes of probability measures and of random variables stand out:

• Each $y \in X$ defines a probability distribution $p_y$ by $p_y(x) = \delta_{xy}$, or explicitly
$p_y(x) = 1$ if $x = y$ and $p_y(x) = 0$ if $x \neq y$; for the corresponding probability
measure one has $P_y(U) = 1$ if $y \in U$ and $P_y(U) = 0$ if $y \notin U$.

• Each event $U \subseteq X$ defines a random variable $1_U$ (i.e., the characteristic function
of $U$) by $1_U(x) = 1$ if $x \in U$ and $1_U(x) = 0$ if $x \notin U$. Clearly, $\sigma(1_U) = \{0\}$ when
$U = \emptyset$, $\sigma(1_U) = \{1\}$ when $U = X$, and $\sigma(1_U) = \{0,1\}$ otherwise. Note that
$1_U(x) = P_x(U)$. Conversely, any random variable $f$ with spectrum $\sigma(f) \subseteq \{0,1\}$
is given by $f = 1_U$ for some $U \subseteq X$; just take $U = \{x \in X \mid f(x) = 1\}$. Such
functions may be construed as yes-no questions to the system (i.e. $f = 1$ versus
$f = 0$) and will lie at the basis of the logical interpretation of the theory (cf. §1.4).


The single most important construction in probability theory is as follows. 


Theorem 1.2. A probability distribution $p$ on $X$ and a random variable $f : X \to \mathbb{R}$
jointly yield a probability distribution $p_f$ on the spectrum $\sigma(f)$ by means of

$$p_f(\lambda) = \sum_{x \in X \mid f(x) = \lambda} p(x). \tag{1.4}$$

In terms of the corresponding probability measure $P$ on $X$, one has

$$p_f(\lambda) = P(f = \lambda), \tag{1.5}$$

where $f = \lambda$ denotes the event $\{x \in X \mid f(x) = \lambda\}$ in $X$. Similarly, the probability
measure $P_f$ on $\sigma(f)$ corresponding to the probability distribution $p_f$ is given by

$$P_f(\Delta) = P(f \in \Delta), \tag{1.6}$$

where $\Delta \subseteq \sigma(f)$ and $f \in \Delta$ denotes the event $\{x \in X \mid f(x) \in \Delta\}$ in $X$.


The proof is trivial. Instead of $f = \lambda$, the notation $f^{-1}(\{\lambda\})$ might be used, and
similarly, $f^{-1}(\Delta)$ is the same as $f \in \Delta$. If $\lambda \in \sigma(f)$ is non-degenerate in that there
is exactly one $x_\lambda \in X$ such that $f(x_\lambda) = \lambda$, then one simply has $P(f = \lambda) = p(x_\lambda)$.

For example, combining both our special cases $P = P_y$ and $f = 1_U$ above yields

$$P_y(1_U = 1) = 1 \text{ and } P_y(1_U = 0) = 0 \text{ if } y \in U; \tag{1.7}$$
$$P_y(1_U = 1) = 0 \text{ and } P_y(1_U = 0) = 1 \text{ if } y \notin U. \tag{1.8}$$
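
The construction of Theorem 1.2 is easy to sketch in code. In the example below, the four-point set, the weights and the helper names (`pushforward`, `point_distribution`, `indicator`) are illustrative assumptions; the last lines reproduce the special case (1.7).

```python
from fractions import Fraction

X = ["a", "b", "c", "d"]   # hypothetical four-point space
p = {"a": Fraction(1, 2), "b": Fraction(1, 4),
     "c": Fraction(1, 8), "d": Fraction(1, 8)}
f = {"a": 0, "b": 1, "c": 1, "d": 2}   # a random variable, stored as a dict

def pushforward(p, f):
    """Distribution p_f on the spectrum of f, as in (1.4)."""
    p_f = {}
    for x, weight in p.items():
        p_f[f[x]] = p_f.get(f[x], Fraction(0)) + weight
    return p_f

def point_distribution(y):
    """The distribution p_y concentrated at y."""
    return {x: Fraction(int(x == y)) for x in X}

def indicator(U):
    """The characteristic function 1_U."""
    return {x: int(x in U) for x in X}

p_f = pushforward(p, f)
# p_f assigns 1/2 to the value 0, 3/8 to the value 1, and 1/8 to the value 2.

p_y_U = pushforward(point_distribution("b"), indicator({"a", "b"}))
# Here y = "b" lies in U, so 1_U = 1 with probability 1, as in (1.7).
```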
Given some probability measure $P$, the expectation value $E_P(f)$ and the variance
$\Delta_P(f)$ of a random variable $f$ with respect to $P$ are defined by, respectively,

$$E_P(f) = \sum_{x \in X} p(x) f(x); \tag{1.9}$$
$$\Delta_P(f) = E_P(f^2) - E_P(f)^2. \tag{1.10}$$

A simple calculation shows that $E_P$ may be written directly in terms of $P$ itself as

$$E_P(f) = \sum_{\lambda \in \sigma(f)} P(f = \lambda) \cdot \lambda. \tag{1.11}$$

Note that $\Delta_P(f) \geq 0$. The special role of the point measures $P_y$ may now be clarified:


Proposition 1.3. A probability measure $P$ takes the form $P = P_y$ for some $y \in X$ iff
$\Delta_P(f) = 0$ for all random variables $f : X \to \mathbb{R}$.


Proof. For “$\Rightarrow$”, we compute $E_{P_y}(f) = f(y)$, and hence $E_{P_y}(f^2) = f(y)^2$. In the
opposite direction, take $f = \delta_y$, so that $f^2 = f$ and hence $\Delta_P(f) = p(y) - p(y)^2$.
The assumption $\Delta_P(f) = 0$ for each $f$ implies that either $p(y) = 0$ or $p(y) = 1$ for
each $y \in X$. Definition 1.1.2 then implies that $p(y) = 1$ for exactly one $y \in X$.
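
Both directions of this proof can be illustrated by a short computation based on (1.9) and (1.10). The three-point set, the test function and the function names are assumptions made for this sketch.

```python
from fractions import Fraction

X = [0, 1, 2]   # hypothetical three-point space

def expectation(p, f):
    """E_P(f) = sum over x of p(x) f(x), as in (1.9)."""
    return sum((p[x] * f[x] for x in X), Fraction(0))

def variance(p, f):
    """Delta_P(f) = E_P(f^2) - E_P(f)^2, as in (1.10)."""
    f_squared = {x: f[x] ** 2 for x in X}
    return expectation(p, f_squared) - expectation(p, f) ** 2

def delta(y):
    """The function delta_y, equal to 1 at y and 0 elsewhere."""
    return {x: Fraction(int(x == y)) for x in X}

point = delta(1)   # doubles as the point distribution p_y for y = 1
mixed = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}
f = {0: Fraction(3), 1: Fraction(-2), 2: Fraction(7)}

var_point = variance(point, f)          # 0: point measures are dispersion-free
var_mixed = variance(mixed, delta(1))   # p(1) - p(1)^2 = 1/4
```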


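More generally, a collection $f_1, \ldots, f_n$ of $n$ random variables and a (single) probability
distribution $p$ on $X$ jointly define a probability distribution $p_{f_1, \ldots, f_n}$ on the
product $\sigma(f_1) \times \cdots \times \sigma(f_n)$ of the individual spectra by

$$p_{f_1, \ldots, f_n}(\lambda_1, \ldots, \lambda_n) = \sum_{x \in X \mid f_1(x) = \lambda_1, \ldots, f_n(x) = \lambda_n} p(x). \tag{1.12}$$

Once again, this may be rewritten as

$$p_{f_1, \ldots, f_n}(\lambda_1, \ldots, \lambda_n) = P(f_1 = \lambda_1, \ldots, f_n = \lambda_n), \tag{1.13}$$

where the argument of $P$ denotes the intersection $\bigcap_{k=1}^n (f_k = \lambda_k)$, i.e.,

$$(f_1 = \lambda_1, \ldots, f_n = \lambda_n) = \{x \in X \mid f_1(x) = \lambda_1, \ldots, f_n(x) = \lambda_n\}. \tag{1.14}$$

Simple calculations then yield results for the so-called marginal distributions, like

$$\sum_{\lambda_{l+1} \in \sigma(f_{l+1}), \ldots, \lambda_n \in \sigma(f_n)} P(f_1 = \lambda_1, \ldots, f_n = \lambda_n) = P(f_1 = \lambda_1, \ldots, f_l = \lambda_l), \tag{1.15}$$

where $1 \leq l < n$. The above constructions also apply to the corresponding conditional
probabilities: given $m$ additional random variables $a_1, \ldots, a_m$, one has

$$\sum_{\lambda_{l+1} \in \sigma(f_{l+1}), \ldots, \lambda_n \in \sigma(f_n)} P(f_1 = \lambda_1, \ldots, f_n = \lambda_n \mid a_1 = \alpha_1, \ldots, a_m = \alpha_m) \tag{1.16}$$
$$= P(f_1 = \lambda_1, \ldots, f_l = \lambda_l \mid a_1 = \alpha_1, \ldots, a_m = \alpha_m). \tag{1.17}$$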
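
A joint distribution and one of its marginals can be computed as follows. The sketch assumes two bits with the uniform distribution and two hypothetical random variables `f1` and `f2`; it checks that summing the joint distribution over the values of the second variable gives the distribution of the first.

```python
from fractions import Fraction
from collections import defaultdict

# Sketch: X = two bits with the uniform distribution.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
p = {x: Fraction(1, 4) for x in X}
f1 = lambda x: x[0]          # reads off the first bit
f2 = lambda x: x[0] ^ x[1]   # XOR of the two bits

def joint(p, fs):
    """Joint distribution of the random variables fs, as in (1.12)."""
    out = defaultdict(Fraction)
    for x, weight in p.items():
        out[tuple(f(x) for f in fs)] += weight
    return dict(out)

def marginal(joint_dist, keep):
    """Sum a joint distribution over all but the first `keep` variables."""
    out = defaultdict(Fraction)
    for values, weight in joint_dist.items():
        out[values[:keep]] += weight
    return dict(out)

p_12 = joint(p, [f1, f2])
p_1_from_joint = marginal(p_12, 1)   # sum over the values of f2
p_1_direct = joint(p, [f1])          # distribution of f1 alone
```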
1.2 Classical observables and states


Given a finite set $X$, we may form the set $C(X)$ of all complex-valued functions on
$X$, enriched with the structure of a complex vector space under pointwise operations:

$$(\lambda \cdot f)(x) = \lambda f(x) \quad (\lambda \in \mathbb{C}); \tag{1.18}$$
$$(f + g)(x) = f(x) + g(x). \tag{1.19}$$

We use the notation $C(X)$ with some foresight, anticipating the case where $X$ is no
longer finite, but in any case, since for the moment it is, every function is continuous.
Moreover, the vector space structure on $C(X)$ may be extended to that of a
commutative algebra (where, by convention, all our algebras are associative and are
defined over the complex scalars) by defining multiplication pointwise, too:

$$(f \cdot g)(x) = f(x) g(x). \tag{1.20}$$

Note that this algebra has a unit $1_X$, i.e., the function identically equal to 1.
For finite X, this structure suffices for X to be recovered from C(X), as follows. 


Definition 1.4. The Gelfand spectrum $\Sigma(A)$ of a (complex) algebra $A$ is the set of
all nonzero linear maps $\omega : A \to \mathbb{C}$ that satisfy $\omega(fg) = \omega(f)\omega(g)$.


These are, of course, precisely the nonzero algebra homomorphisms from $A$ to $\mathbb{C}$.


Proposition 1.5. The Gelfand spectrum $\Sigma(C(X))$ is isomorphic (as a set) to $X$.


Proof. Each $x \in X$ defines a map $\omega_x : C(X) \to \mathbb{C}$ by $\omega_x(f) = f(x)$. One obviously
has $\omega_x \in \Sigma(C(X))$, so we have a map $X \to \Sigma(C(X))$, $x \mapsto \omega_x$. We show that this map
is a bijection. Injectivity is easy: if $\omega_x = \omega_y$, then $f(x) = f(y)$ for each $f \in C(X)$,
so taking $f = \delta_z$ for each $z \in X$ gives $x = y$ (here $\delta_z(x) = \delta_{xz}$). To prove surjectivity,
we note that since $C(X)$ is finite-dimensional as a vector space, with basis $(\delta_y)_{y \in X}$,
each linear functional $\omega : C(X) \to \mathbb{C}$ takes the form

$$\omega(f) = \sum_{x \in X} \mu(x) f(x), \tag{1.21}$$

for some function $\mu : X \to \mathbb{C}$. For $\omega \in \Sigma(C(X))$, find some $z \in X$ for which $\mu(z) \neq 0$
(this has to exist, as $\omega \neq 0$). For arbitrary $w \in X$, imposing $\omega(\delta_w \delta_z) = \omega(\delta_w)\omega(\delta_z)$,
which by $\delta_w \delta_z = \delta_{wz} \delta_z$ reads $\delta_{wz} \mu(z) = \mu(w)\mu(z)$,
enforces $\mu = \delta_z$ (which also shows that $z$ is unique), and hence $\omega = \omega_z$.
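
The surjectivity argument can be illustrated by brute force on a three-point set. The sketch below writes a linear functional as $\omega(f) = \sum_x \mu(x) f(x)$ and tests multiplicativity on the basis functions $\delta_x$. The finite grid of candidate coefficients is an assumption of the example, so this is an illustration, not a proof.

```python
from fractions import Fraction
from itertools import product

X = [0, 1, 2]   # hypothetical three-point space
candidates = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(-1)]

def is_multiplicative(mu):
    """Check omega(d_w d_z) = omega(d_w) omega(d_z) on the basis (d_x).

    Here omega(f) = sum_x mu[x] f(x), and d_w d_z equals d_z if w == z
    and the zero function otherwise, so omega(d_w d_z) is mu[z] or 0.
    By bilinearity, checking basis elements suffices.
    """
    return all(
        (mu[z] if w == z else 0) == mu[w] * mu[z]
        for w in X for z in X
    )

# Keep the nonzero coefficient vectors that define multiplicative functionals.
spectrum = [
    mu for mu in product(candidates, repeat=len(X))
    if any(mu) and is_multiplicative(mu)
]
# Among the candidates tried, only the three point evaluations survive.
```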


The physically relevant set $R(X)$ of all real-valued functions on $X$ is obviously
a real vector space inside $C(X)$. To recover it algebraically, we equip $C(X)$ with an
involution, which on an arbitrary (not necessarily commutative) algebra $A$ is defined
as an anti-linear anti-homomorphism that squares to $\mathrm{id}_A$, i.e., a real-linear map $* : A \to A$
(written $a \mapsto a^*$) that satisfies $(\lambda a)^* = \bar{\lambda} a^*$, $(ab)^* = b^* a^*$, and $a^{**} = a$. In our case
$A = C(X)$, which is commutative, the second property simply becomes $(fg)^* = f^* g^*$.
In any case, we define this involution by pointwise complex conjugation, i.e.,

$$f^*(x) = \overline{f(x)}. \tag{1.22}$$
We evidently recover the real-valued functions in the involutive algebra $C(X)$ as

$$R(X) = C(X)_{\mathrm{sa}} = \{f \in C(X) \mid f^* = f\}. \tag{1.23}$$

Finally, although we do not need this yet, we note that $C(X)$ has a natural norm

$$\|f\|_\infty = \sup_{x \in X} \{|f(x)|\}. \tag{1.24}$$


These structures turn C(X) into a commutative C*-algebra (cf. Definition C.1). 


Definition 1.6. The algebra of observables of the physical system described by the 
phase space X is C(X), seen as a (commutative) C*-algebra in the above way. 


Thence elements of C(X) are called observables (a term that really should be applied 
only to its self-adjoint elements, i.e., those satisfying f* = f). 

We have thus equipped the random variables on X with enough structure to re- 
cover X itself, and now turn to the other side of the coin, viz. the probability mea- 
sures on X. Here the relevant mathematical structure is that of a compact convex set, 
a concept we only need to define in the context of an ambient (real) vector space. 


Definition 1.7. A subset $K$ of a (real or complex) vector space $V$ is called convex if
the straight line segment between any two points in $K$ lies in $K$. Expressed formally,
this means that whenever $v, w \in K$ and $t \in (0,1)$, one has $tv + (1-t)w \in K$.


The following probabilistic reformulation of this notion is very useful.


Proposition 1.8. A set $K \subseteq V$ is convex iff for any $k$, given $k$ probabilities $(t_1, \ldots, t_k)$
(i.e., $t_i \geq 0$ and $\sum_i t_i = 1$) and $k$ points $(v_1, \ldots, v_k)$ in $K$, one has $\sum_{i=1}^k t_i v_i \in K$.


Proof. Taking $k = 2$ recovers Definition 1.7 from its probabilistic version. Conversely,
one uses induction on $k$, using the identity (assuming $0 < t_k < 1$):

$$t_1 v_1 + \cdots + t_k v_k = (1 - t_k) \left( \frac{t_1}{1 - t_k} v_1 + \cdots + \frac{t_{k-1}}{1 - t_k} v_{k-1} \right) + t_k v_k.$$

Any linear subspace of V is trivially convex, as is any translate thereof (i.e., any 
affine subspace of V). Another, much more important example is the convex hull 
co(S) of any subset S C V; noting that the intersection of any family of convex sets 
is again convex, co(S) may be defined as the intersection of all convex subsets of V 
that contain S, or, equivalently, as the smallest convex subset of V that contains S 
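(whose existence is guaranteed by the previous remark). Proposition 1.8 then yields

$$\mathrm{co}(S) = \left\{ \sum_{i=1}^k t_i v_i \;\middle|\; k \in \mathbb{N}, \ v_1, \ldots, v_k \in S, \ t_i \geq 0, \ \sum_i t_i = 1 \right\}. \tag{1.25}$$

In particular, if $S = \{v_1, \ldots, v_k\}$ is a finite set, then one simply has

$$\mathrm{co}(\{v_1, \ldots, v_k\}) = \left\{ \sum_{i=1}^k t_i v_i \;\middle|\; t_i \geq 0, \ \sum_i t_i = 1 \right\}. \tag{1.26}$$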
The convex hull of any finite set of points in $\mathbb{R}^{n+1}$ is called a convex polytope. Such
convex sets are closed and bounded (since none of the $t_i \geq 0$ can walk away too far
without violating the condition $\sum_i t_i = 1$), and hence are compact. In particular,

$$\Delta_n = \{x \in \mathbb{R}^{n+1} \mid x_i \geq 0, \ \sum_i x_i = 1\} \tag{1.27}$$

is a convex polytope called a simplex. For example, $\Delta_1$ is the line segment from
$(0,1)$ to $(1,0)$ in $\mathbb{R}^2$. We would like to say that $\Delta_1$ is “isomorphic” to the unit interval
$[0,1]$, so we define two convex sets $K_1, K_2$ to be isomorphic (as such) if there is a
bijection $f : K_1 \to K_2$ that is affine, in that for $t \in (0,1)$ and $v_1, v_2 \in K_1$, we have

$$f(t v_1 + (1-t) v_2) = t f(v_1) + (1-t) f(v_2). \tag{1.28}$$

Then the function $f : \Delta_1 \to [0,1]$ given by $f(\lambda, 1-\lambda) = \lambda$, where $\lambda \in [0,1]$, will do.
Similarly, $\Delta_2 \subset \mathbb{R}^3$ is isomorphic to any equilateral triangle in $\mathbb{R}^2$ with sides of unit
length, whereas $\Delta_3$ is just the tetrahedron (which is one of the five Platonic solids).

There are many other convex polytopes (cf. §B.11), but simplices are of prime
importance for us, since $\Delta_n$ is isomorphic to the set $\mathrm{Pr}(X)$ of all probability distributions
on a set $X = \{0, \ldots, n\}$ with $n+1$ points; the identification $\mathrm{Pr}(X) \ni p \leftrightarrow x \in \Delta_n$
is given by $x_i = p(i-1)$ for $i = 1, \ldots, n+1$. In particular, we see that for any finite set $X$, $\mathrm{Pr}(X)$ is a
compact convex set. This is also clear from Definitions 1.1 and 1.7 (and will even
be true for general compact phase spaces $X$, cf. Corollary B.17 and §C.25).


Definition 1.9. The state space of the physical system described by a (finite) space 
X is the set Pr(X) of all probability measures on X (or, equivalently, of all probability 
distributions on X ), seen as a compact convex set. 


Thus a probability measure (or distribution) on $X$ is often called a state (of the
physical system described by $X$). The operation of passing from states $P, Q \in \mathrm{Pr}(X)$
to a new state $tP + (1-t)Q \in \mathrm{Pr}(X)$, where $t \in (0,1)$ as usual, or, more generally,
from a (finite) family of states $(P_i)$ and a set $(t_i)$ of probabilities (i.e., $t_i \geq 0$ and
$\sum_i t_i = 1$) to the convex sum $\sum_i t_i P_i$, is called mixing.
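
Mixing is straightforward to compute once states are represented as points of a simplex. In the sketch below, states on a three-point set are tuples in $\Delta_2$; the helper names `in_simplex` and `mix` and the sample states are illustrative assumptions.

```python
from fractions import Fraction

def in_simplex(x):
    """Membership in Delta_n: nonnegative coordinates summing to 1."""
    return all(c >= 0 for c in x) and sum(x) == 1

def mix(weights, points):
    """Convex sum of points (tuples of equal length) with the given weights."""
    if not in_simplex(weights):
        raise ValueError("weights must be probabilities")
    dim = len(points[0])
    return tuple(sum(t * v[i] for t, v in zip(weights, points))
                 for i in range(dim))

P = (Fraction(1, 2), Fraction(1, 2), Fraction(0))
Q = (Fraction(0), Fraction(1, 3), Fraction(2, 3))
t = Fraction(1, 4)

mixture = mix((t, 1 - t), (P, Q))
# mixture == (1/8, 3/8, 1/2), which is again a point of Delta_2
```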


It is possible to recover X from its associated state space Pr(X), as follows. 


Definition 1.10. The (extreme) boundary $\partial_e K$ of a convex set $K$ consists of all
points $v \in K$ satisfying the following condition:

if $v = tw + (1-t)x$ for certain $w, x \in K$ and $t \in (0,1)$, then $v = w = x$.

Elements $v \in \partial_e K$ of the boundary are called extreme points of $K$.


We will now compute the boundary of $\mathrm{Pr}(X)$. The result may be expressed by

$$\partial_e \Delta_n = \{e_1, \ldots, e_{n+1}\}, \tag{1.29}$$

where $(e_1, \ldots, e_{n+1})$ is the standard basis of $\mathbb{R}^{n+1}$ (i.e., $e_1 = (1, 0, \ldots, 0)$, etc.). However,
we will give a direct probabilistic proof. We already noted the special probability
measures $P_x$, $x \in X$. The association $x \mapsto P_x$ defines a map from $X$ to $\mathrm{Pr}(X)$.
Proposition 1.11. The set $X$ is isomorphic to the boundary $\partial_e \mathrm{Pr}(X)$ through $x \mapsto P_x$.


Proof. It is convenient to work with probability distributions $p$ rather than probability
measures $P$. First, $x \mapsto p_x$ is trivially injective from $X$ to $\mathrm{Pr}(X)$: if $x \neq y$
then $p_x(x) = 1$ whereas $p_y(x) = 0$, so $p_x \neq p_y$. Second, $p_x \in \partial_e \mathrm{Pr}(X)$. For suppose
one has $p_x = tp + (1-t)q$ for some $p, q \in \mathrm{Pr}(X)$ and $t \in (0,1)$. Hence
$p_x(y) = t p(y) + (1-t) q(y)$. Taking $y \neq x$ yields $p(y) = q(y) = 0$, so that $p = q = p_x$.
Consequently, $X \subseteq \partial_e \mathrm{Pr}(X)$.

The converse inclusion is (contrapositively) equivalent to the property that for
any $p \neq p_x$ (for all $x$), there are $q$ and $r$, $q \neq r$, and $t \in (0,1)$, with $p = tq + (1-t)r$.
Indeed, if $p \neq p_x$ for all $x$, there is some $x_0 \in X$ with $0 < p(x_0) < 1$. Now define $q$, $r$, and $t$
by $q(x_0) = 1$ and $q(x) = 0$ for all $x \neq x_0$, $r(x_0) = 0$ and $r(x) = p(x)/(1 - p(x_0))$ for all $x \neq x_0$, and
finally $t = p(x_0)$. Then $p = tq + (1-t)r$ and $q \neq r$.
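
The second half of the proof is constructive, and the construction can be sketched directly. The function name `split` and the sample distribution are assumptions of this example; the code follows the recipe for $q$, $r$ and $t$ given in the proof.

```python
from fractions import Fraction

def split(p):
    """Write a non-point distribution p as t*q + (1-t)*r with q != r.

    Follows the construction in the proof: pick x0 with 0 < p(x0) < 1,
    let q be the point distribution at x0, and let r be p restricted to
    the complement of x0 and renormalized.
    """
    x0 = next((x for x, w in p.items() if 0 < w < 1), None)
    if x0 is None:
        raise ValueError("p is a point distribution, hence extreme")
    t = p[x0]
    q = {x: Fraction(int(x == x0)) for x in p}
    r = {x: Fraction(0) if x == x0 else p[x] / (1 - t) for x in p}
    return t, q, r

p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
t, q, r = split(p)
recombined = {x: t * q[x] + (1 - t) * r[x] for x in p}
```

Here `recombined` agrees with `p`, so `p` is a proper mixture of two distinct distributions and therefore not extreme.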


The simplest example would be $X = \{0,1\}$, so that $\mathrm{Pr}(X) \cong [0,1]$ by mapping the
distribution $p \in \mathrm{Pr}(X)$ to $p(1)$. Since one may directly verify that $\partial_e [0,1] = \{0,1\}$,
under the above isomorphism one therefore has $\partial_e \mathrm{Pr}(X) \cong \{0,1\}$. Analogously,
$\partial_e (0,1) = \emptyset$, so that the boundary of a convex set may apparently be empty. Hence
we see that one remarkable ingredient of Proposition 1.11 lies in the claim that the
convex set $\mathrm{Pr}(X)$ actually has a (nonempty) boundary! This is no accident: by the
Krein–Milman Theorem (cf. §B.10), this is true for any compact convex set (which
is consistent with the counterexample just given, since the open interval $(0,1)$ is not
compact). For example, in quantum mechanics
we will encounter the case of $K = B^3$ (i.e. the closed unit ball in $\mathbb{R}^3$) as the state
space of a qubit, whose (extreme) boundary is the two-sphere $S^2$, cf. Proposition
2.9. Something similar is true in any dimension, but beware of surprises: if $K = \Delta_2$
is an equilateral triangle in the plane, then its extreme boundary $\partial_e K$ consists of the
vertices of $K$ (whereas its faces form the geometric boundary of the triangle).

The general problem arises whether some point $v \in K$ of a compact convex set $K$
may be written as a convex sum (or, more generally, an integral) of extreme points
of $K$, and if so, to what extent this extremal decomposition

$$v = \sum_{i \in I} t_i v_i, \quad t_i \geq 0, \quad \sum_i t_i = 1, \quad v_i \in \partial_e K, \tag{1.30}$$

which for simplicity has been assumed to be a finite sum here, is unique. Without
proof, we state a general result of convexity theory, called Carathéodory’s Theorem:


Theorem 1.12. If $K$ is a nonempty compact convex subset of $\mathbb{R}^n$, then $\partial_e K \neq \emptyset$, and
each point of $K$ is a convex sum of at most $n+1$ points in $\partial_e K$.


If $K = \Delta_n$, then this sum generically has $n+1$ points and is unique. Probabilistically:


Proposition 1.13. If $X$ is finite, then any probability measure $P \in \mathrm{Pr}(X)$ may be
written in a unique way as a finite mixture of extreme probability measures, viz.

$$P = \sum_{x \in X} t_x P_x. \tag{1.31}$$
Proof. Take $t_x = P(\{x\})$ in the sense of Definition 1.1, or, equivalently, $t_x = E_P(\delta_x)$
in the sense of (1.9). To see that this decomposition is unique, use Proposition 1.11,
i.e. $\partial_e \mathrm{Pr}(X) \cong X$, in (1.30) to force $I = X$ and apply both sides of (1.31) to $\delta_x$.
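
The decomposition (1.31) can be checked on a small example. The measure `P` below, with its explicit weights, is a hypothetical choice; the coefficients are computed as $t_x = P(\{x\})$, exactly as in the proof, and the mixture is compared with `P` on a few events.

```python
from fractions import Fraction

X = ["a", "b", "c"]   # hypothetical three-point space

def point_measure(y):
    """P_y as a function on events: 1 if y is in U, else 0."""
    return lambda U: Fraction(int(y in U))

def P(U):
    """A hypothetical probability measure on X, given by explicit weights."""
    weights = {"a": Fraction(1, 5), "b": Fraction(3, 10), "c": Fraction(1, 2)}
    return sum((weights[x] for x in U), Fraction(0))

# Coefficients t_x = P({x}), as in the proof.
t = {x: P({x}) for x in X}

def mixture(U):
    """The right-hand side of (1.31), evaluated on an event U."""
    return sum((t[x] * point_measure(x)(U) for x in X), Fraction(0))

events = [set(), {"a"}, {"a", "c"}, {"b", "c"}, set(X)]
agrees = all(P(U) == mixture(U) for U in events)
```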


The state space and the algebra of observables may also be defined in terms of 
each other. We start with the (re)construction of states from observables, where the 
following definition and proposition may leave a hybrid impression. The rationale 
behind our approach is that for many purposes it is easier to work with the com- 
plex algebra C(X), but on the other hand, compact convex sets are most naturally 
defined in terms of real vector spaces. Fortunately, it is easy to switch between the 
two: we already know how to obtain the real part R(X) from C(X), see (1.23), and 
conversely, C(X) is simply the complexification of the real vector space R(X). 


Definition 1.14. A state on $C(X)$ is a linear map $\omega : C(X) \to \mathbb{C}$ that satisfies:


1. $\omega(f^2) \geq 0$ for each $f \in C(X)$ with $f^* = f$ (positivity);
2. $\omega(1_X) = 1$ (normalization).


The first condition obviously comes down to $\omega(f) \geq 0$ whenever $f \geq 0$ pointwise.
Equivalently, we may define a state on $R(X)$ as a real-linear map $\omega_{\mathbb{R}} : R(X) \to \mathbb{R}$
that satisfies the very same conditions. Indeed, a state $\omega_{\mathbb{R}}$ on $R(X)$ defines a
complex-linear map $\omega : C(X) \to \mathbb{C}$ by $\omega(f + ig) = \omega_{\mathbb{R}}(f) + i\omega_{\mathbb{R}}(g)$, where $f, g \in R(X)$.
This map satisfies the same conditions of positivity and normalization. Conversely,
$\omega$ may be restricted to the real part $R(X)$ of $C(X)$, so that there is no real
(sic) difference between $\omega$ and $\omega_{\mathbb{R}}$. Hence we will use these interchangeably, often
even dropping the suffix $\mathbb{R}$ on $\omega$. One advantage of this ability to switch is that a
state $\omega$ on $C(X)$ may be regarded as an element of the real vector space $R(X)^*$.
Doing so shows that the terminology of Definitions 1.9 and 1.14 is consistent:


Theorem 1.15. There is a bijective correspondence between states $\omega$ on $C(X)$ and
probability measures $P$ on $X$, given by $\omega \leftrightarrow E_P$, cf. (1.9) and (1.11). Therefore, as
a subset of the (real) vector space $R(X)^*$ of all (real-)linear maps from $R(X)$ to $\mathbb{R}$,
the set $S(C(X))$ of all states on $C(X)$ coincides with the set $\mathrm{Pr}(X)$ of all probability
measures on $X$. In particular, the state space $S(C(X))$ of $C(X)$ is a compact convex
set in $R(X)^*$ (as a finite-dimensional vector space with its usual topology).


Proof. Given a state $\omega$, define a function $p : X \to \mathbb{R}$ by $p(x) = \omega(\delta_x)$. Since $\delta_x \geq 0$
pointwise, positivity of $\omega$ yields $p(x) \geq 0$. Noting that $1_X = \sum_x \delta_x$, normalization
then forces $\sum_x p(x) = 1$, so that $p$ is a probability distribution on $X$. Hence
$P \in \mathrm{Pr}(X)$, where $P$ is the probability measure corresponding to $p$, and since
$f = \sum_x f(x)\delta_x$, linearity gives $\omega(f) = \sum_x p(x) f(x)$, i.e. $\omega = E_P$. Conversely,
$P \in \mathrm{Pr}(X)$ defines a map $E_P : R(X) \to \mathbb{R}$ by (1.9), which is positive and normalized.
Note that compactness and convexity of the set $S(C(X))$ in $R(X)^*$ follow directly
from its definition, i.e., even without knowing that it equals $\mathrm{Pr}(X)$.
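
The correspondence of Theorem 1.15 can be checked by hand on a small example. The following Python sketch is only an illustration under our own assumptions: the three-point set `X`, the encoding of functions as dictionaries, and all helper names are hypothetical and do not come from the text. It builds the state $E_P$ from a distribution $p$ and recovers $p$ through $p(x) = \omega(\delta_x)$, as in the proof.

```python
from fractions import Fraction

X = ["a", "b", "c"]  # hypothetical finite phase space


def delta(x):
    """The function delta_x on X, encoded as a dictionary."""
    return {y: Fraction(int(y == x)) for y in X}


def state_from_distribution(p):
    """The state E_P: f -> sum_x p(x) f(x)."""
    return lambda f: sum(p[x] * f[x] for x in X)


def distribution_from_state(omega):
    """Recover p(x) = omega(delta_x), as in the proof of Theorem 1.15."""
    return {x: omega(delta(x)) for x in X}


p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = state_from_distribution(p)
one_X = {x: Fraction(1) for x in X}           # the unit function 1_X
recovered = distribution_from_state(omega)    # should coincide with p
```

Exact rational arithmetic (`fractions.Fraction`) is used so that normalization and the round trip $p \mapsto E_P \mapsto p$ hold without rounding error.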


Consequently, we may refer to S(C(X)) as the state space of C(X) without any 
ambiguity, and we will always regard state spaces of (unital) C*-algebras A (cf. 
Appendix C) as compact convex sets S(A), where in the present case A = C(X). 


1.3 Pure states and transition probabilities 


For any C*-algebra $A$ (with unit), and hence in particular for $A = C(X)$, elements of
the boundary $\partial_e S(A)$ are called pure states, and we call

$$ P(A) = \partial_e S(A) \tag{1.32} $$

the pure state space of $A$. States that are not pure are called mixed.


Theorem 1.16. One has $P(C(X)) \cong X$, in that the following map is an isomorphism:

$$ X \to P(C(X)), \quad x \mapsto \omega_x, \quad \omega_x(f) = f(x). \tag{1.33} $$


Proof. Combine Proposition 1.11 and Theorem 1.15.


For finite X this isomorphism is merely meant as a bijection between sets (and for 
general compact Hausdorff spaces X it will be a homeomorphism of topological 
spaces), but we will now introduce some additional structure on pure state spaces 
that will enrich Theorem 1.16 to an isomorphism of so-called sets with a transition 
probability. This will be necessary in order to reconstruct the observables from the 
pure states, but it also clarifies the general probabilistic structure of physics (note 
that the following definition is unusual in probability theory!). 


Definition 1.17. A transition probability on a set $X$ is a function

$$ \tau : X \times X \to [0,1] \tag{1.34} $$

that satisfies $\tau(x,y) = 1$ iff $x = y$ and $\tau(x,y) = \tau(y,x)$ (symmetry).

The simplest example of a transition probability (on any set $X$) is obviously

$$ \tau(x,y) = \delta_{xy}. \tag{1.35} $$


The point is that this transition probability may be derived from the classical
C*-algebra of observables $C(X)$ by the following formula (assuming $X$ finite):

$$ \delta_{xy} = \inf\{ f(x) \mid f \in C(X),\ 0 \leq f \leq 1_X,\ f(y) = 1 \}. \tag{1.36} $$

Indeed, for $x = y$ this is a tautology, whereas for $x \neq y$ the infimum (which is zero)
is attained by $f = \delta_y$. In terms of the pure state space $P(C(X))$, which is isomorphic
to but not equal to $X$, cf. Theorem 1.16, this formula may be written as

$$ \delta_{xy} = \inf\{ \omega_x(f) \mid f \in C(X),\ 0 \leq f \leq 1_X,\ \omega_y(f) = 1 \}. \tag{1.37} $$

Furthermore (and this is the real point, so that we already have to mention it here,
ahead of a more detailed treatment in the context of quantum mechanics), the
right-hand side of (1.37) may be generalized to any finite-dimensional C*-algebra $A$ by

$$ \tau_A(\omega, \omega') = \inf\{ \omega(a) \mid a \in A,\ 0 \leq a \leq 1_A,\ \omega'(a) = 1 \}, \tag{1.38} $$

where $\omega, \omega' \in P(A)$. Since (1.38) clearly generalizes (1.37), for $A = C(X)$ we have

$$ \tau_{C(X)}(\omega_x, \omega_y) = \delta_{xy}. \tag{1.39} $$


Note that the symmetry property in Definition 1.17 is not obvious from (1.38), but
in the classical case $A = C(X)$ it is true by computation, and the same will hold in
quantum theory. To motivate these definitions, we recall that $f$ in (1.37), and likewise
$a$ in (1.38), are yes-no questions to the system, so that the transition probability
$\tau_A(\omega, \omega')$ monitors to what extent the states $\omega$ and $\omega'$ may be sharply distinguished
by asking such questions. If they can, there should be some question $a$ for which
$\omega'(a) = 1$ and $\omega(a) = 0$, so that $\tau_A(\omega, \omega') = 0$ (if $\omega \neq \omega'$, of course). As we have seen,
in the classical case this can always be done. However, we shall see this is no longer
the case in quantum mechanics, where pure states may be thus distinguished iff they
correspond to orthogonal unit vectors in Hilbert space. Further motivation for the
expression (1.38) is post hoc, as it turns out to allow a reconstruction of the vector
space of observables $A$, supplemented by the part of its algebraic structure that
determines its logical and probabilistic structure (viz. the ability to form squares,
$a \mapsto a^2$) from $P(A)$ with its associated transition probability. See Theorem C.179.
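
As a sanity check on the classical formula (1.36), the infimum can be brute-forced on a tiny example. The sketch below rests on assumptions of our own: a three-point set `X`, and functions $f$ restricted to the hypothetical grid of values $\{0, \tfrac12, 1\}$. Since the infimum in (1.36) is attained by a $\{0,1\}$-valued function, restricting to this grid does not change the answer.

```python
from fractions import Fraction
from itertools import product

X = range(3)  # hypothetical three-point phase space, points labelled 0, 1, 2
GRID = (Fraction(0), Fraction(1, 2), Fraction(1))  # sampled values with 0 <= f <= 1


def tau_classical(x, y):
    """Minimum of f(x) over all grid functions f with f(y) = 1, cf. (1.36)."""
    admissible = (f for f in product(GRID, repeat=len(X)) if f[y] == 1)
    return min(f[x] for f in admissible)


# Table of all transition probabilities; it should reproduce the Kronecker delta.
table = {(x, y): tau_classical(x, y) for x in X for y in X}
```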

First, we develop some theory that puts both classical and quantum mechanics 
into a more general setting. Notwithstanding the formal incorporation of the former, 
the underlying Hilbert space thinking will be obvious throughout. 


Definition 1.18. Let $(X,\tau)$ be a set with a transition probability.


1. A subset $O \subset X$ is orthonormal if $\tau(x,y) = \delta_{xy}$ for all $x,y \in O$.
2. A basis of a set $X$ with a transition probability $\tau$ is an orthonormal family $B \subset X$
such that for each $x \in X$ one has

$$ \sum_{u \in B} \tau(x,u) = 1. \tag{1.40} $$

A basis of a subset $S \subset X$ is an orthonormal family $B \subset S$ such that (1.40) holds
for each $x \in S$. Relative to such a basis $B$ of $S$, we define $\tau_S : X \to \mathbb{R}$ by

$$ \tau_S(x) = \sum_{u \in B} \tau(x,u). \tag{1.41} $$

As a special case, for $S = \{u\}$ we write $\tau_u = \tau_{\{u\}}$, so that

$$ \tau_u(x) = \tau(x,u). \tag{1.42} $$

3. The orthocomplement $S^\perp$ of some subset $S \subset X$ is defined as

$$ S^\perp = \{ y \in X \mid \tau(x,y) = 0 \ \forall x \in S \}. \tag{1.43} $$

4. A subset $S \subset X$ is orthoclosed if $S^{\perp\perp} = S$ (where $S^{\perp\perp} = (S^\perp)^\perp$).
5. A resolution of the identity in $X$ is a family of orthogonal orthoclosed subsets
$(S_j)_j$ (i.e., $\tau(x_i,x_j) = 0$ if $x_i \in S_i$, $x_j \in S_j$, and $i \neq j$), for which $\sum_j \tau_{S_j} = 1_X$.
6. An observable for the pair $(X,\tau)$ is a bounded function $f : X \to \mathbb{R}$ of the form

$$ f = \sum_j c_j \tau_{u_j}, \quad c_j \in \mathbb{R},\ u_j \in X. \tag{1.44} $$

The real vector space of such observables is called $\ell^\infty(X,\tau)$.
7. A spectral resolution of an observable $f \in \ell^\infty(X,\tau)$ is a decomposition

$$ f = \sum_j \lambda_j \tau_{S_j}, \tag{1.45} $$

where $(S_j)$ is a resolution of the identity and each $\lambda_j \in \mathbb{R}$ occurs at most once.


In the present section $X$ is finite, whilst in the following section on quantum
mechanics on finite-dimensional Hilbert spaces at least all bases will be finite, so that
there are no convergence issues. In general, $B$ may be infinite, in which case (1.40)
is defined as the least upper bound of all finite partial sums, and all sums in
Definition 1.18 are defined pointwise (i.e., in $x$). In that case, eq. (1.45) may need to be
adapted through limit constructions. Furthermore, one may worry about the
basis-dependence of $\tau_S$ in (1.41), but fortunately it turns out that in all sets with a
transition probability that arise as pure state spaces defined by C*-algebras according to
(1.38), the function $\tau_S$ is independent of the basis $B$ whenever $S$ is orthoclosed. In
that case, spectral resolutions exist and are unique, and one may turn the real vector
space $\ell^\infty(X,\tau)$ of part 6 into a Jordan algebra by defining a product $\circ$ through

$$ f^2 = \sum_j \lambda_j^2 \tau_{S_j}; \tag{1.46} $$

$$ f \circ g = \tfrac{1}{4}\left( (f+g)^2 - (f-g)^2 \right). \tag{1.47} $$

In the classical case this yields the pointwise product (1.20), whereas in quantum
mechanics it recovers (half) the anti-commutator, $a \circ b = \tfrac{1}{2}(ab + ba)$. Both are examples
of Jordan products (cf. §C.25), i.e., commutative products $\circ$ satisfying the curious
axiom (C.619).

All this trivializes if $\tau = \tau_{C(X)}$ is given by (1.35), where $X$ need not even be finite:


1. Any subset $O \subset X$ is orthonormal.
2. The set $B = X$ itself is the only basis of $(X,\tau)$, and analogously $B = S$.
3. The orthocomplement $S^\perp$ is the set-theoretic complement $S^c = X \setminus S$.
4. Hence any subset $S \subset X$ is orthoclosed.
5. Any partition $X = \bigsqcup_j S_j$ yields a resolution of the identity.
6. Any bounded function $f : X \to \mathbb{R}$ is an observable, so that when $X$ is finite,

$$ \ell^\infty(X,\tau) = R(X) = C(X,\mathbb{R}); \tag{1.48} $$

7. The spectral resolution (1.45) of $f$ is given (analogously to operator theory) by

$$ f = \sum_{\lambda \in \sigma(f)} \lambda \cdot 1_{f^{-1}(\{\lambda\})}, \tag{1.49} $$

cf. Definition 1.1.5. In particular, spectral resolutions in (1.48) are unique.
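
The classical spectral resolution (1.49) and the Jordan product (1.46) - (1.47) are simple enough to compute directly. The Python sketch below is a hedged illustration: the four-point set `X`, the sample functions `f` and `g`, and the helper names are hypothetical choices of ours. It groups the points of `X` by the value of `f`, rebuilds `f` from its level sets, and confirms that the product defined through squares is the pointwise product.

```python
from fractions import Fraction

X = ["a", "b", "c", "d"]               # hypothetical finite phase space
f = {"a": 2, "b": -1, "c": 2, "d": 0}  # sample observables (assumptions)
g = {"a": 1, "b": 3, "c": 0, "d": 5}


def spectral_resolution(h):
    """Map each value lam in sigma(h) to its level set S_lam = h^{-1}({lam})."""
    levels = {}
    for x, value in h.items():
        levels.setdefault(value, set()).add(x)
    return levels


def from_resolution(levels, power=1):
    """Rebuild sum_lam lam**power * 1_{S_lam} as a function on X."""
    return {x: sum(lam ** power for lam, S in levels.items() if x in S) for x in X}


def square(h):
    """h^2 defined through the spectral resolution, cf. (1.46)."""
    return from_resolution(spectral_resolution(h), power=2)


def jordan(h, k):
    """h o k = ((h + k)^2 - (h - k)^2) / 4, cf. (1.47)."""
    plus = square({x: h[x] + k[x] for x in X})
    minus = square({x: h[x] - k[x] for x in X})
    return {x: Fraction(plus[x] - minus[x], 4) for x in X}


levels = spectral_resolution(f)
```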
1.4 The logic of classical mechanics


Whatever one’s route to $C(X,\mathbb{R})$ as the algebra of observables, i.e. either as a starting
point or as a derived concept as in (1.48), it determines the logical structure of
classical mechanics (we here restrict ourselves to propositional logic). According to
the general scheme reviewed in §D.2, apart from the usual logical connectives $\neg$,
$\wedge$, $\vee$, and $\to$ for not, and, or, and implies, a propositional theory needs a set $\Sigma_X$ of
atomic propositions. These are provided by $C(X,\mathbb{R})$, and $\Sigma_X$ consists of all expressions
$f \in \Delta$ (we expect no confusion between this notation for both propositions in
logic and events in probability theory), where $f : X \to \mathbb{R}$ is a function, and $\Delta$ is some
subset of $\mathbb{R}$. As we shall see, $f \in \Delta$ is always false if $\Delta \cap \sigma(f) = \emptyset$, so we might
as well assume that $\Delta \subseteq \sigma(f)$. We write $f = \lambda$ for $f \in \{\lambda\}$. From these elementary
propositions, propositions are constructed inductively using the iterative rules
of propositional logic (see §D.2). This produces a set $\mathcal{B}_X = \mathcal{B}_{\Sigma_X}$ of propositions.

Of course, there are logical relations between our atomic propositions (and hence
between elements of $\mathcal{B}_X$). For example, if $\Delta \subseteq \Delta'$, then $f \in \Delta$ should imply $f \in \Delta'$.
Such relations may be formulated as axioms of some propositional theory $\mathcal{T}_X$
describing the logic of classical mechanics. These axioms take the following form:

$$ (f \in \Gamma) \to (g \in \Delta) \quad \text{iff} \quad f^{-1}(\Gamma) \subseteq g^{-1}(\Delta). \tag{1.50} $$

This may also be formulated through the notion of semantic entailment. For each
$x \in X$, we define a valuation $V_x : \Sigma_X \to \{0,1\}$ (cf. §D.2) by

$$ V_x(f \in \Delta) = 1 \quad \text{iff} \quad f(x) \in \Delta, \tag{1.51} $$

extended to a map $V_x : \mathcal{B}_X \to \{0,1\}$ through the recursive use of truth tables. Defining
the semantic entailment relation $\models_X$ on $\mathcal{B}_X$ by $\alpha \models_X \beta$ iff $V_x(\alpha) = 1$ implies
$V_x(\beta) = 1$ for all $x \in X$, it is easy to see that $\alpha \to \beta$ as defined in (1.50) iff $\alpha \models_X \beta$.
In order to compute the ensuing Lindenbaum algebra $\mathcal{L}_X = \mathcal{L}_{\mathcal{T}_X}$, we note that

$$ (f \in \Gamma) \leftrightarrow (g \in \Delta) \quad \text{iff} \quad f^{-1}(\Gamma) = g^{-1}(\Delta). \tag{1.52} $$

Writing $\sim_X$ for $\sim_{\mathcal{T}_X}$ (which is the equivalence relation given by $\models_X$, too), we find

$$ (f \in \Delta) \sim_X (1_{f^{-1}(\Delta)} = 1), \tag{1.53} $$

where we recall that $1_A$ is the characteristic (or indicator) function of $A$. Using the
truth tables for $\wedge$ and for $\neg$, we also obtain (in terms of the complement $\Delta^c = \mathbb{R} \setminus \Delta$):

$$ (f \in \Gamma) \wedge (g \in \Delta) \sim_X (1_{f^{-1}(\Gamma) \cap g^{-1}(\Delta)} = 1); \tag{1.54} $$
$$ \neg(f \in \Delta) \sim_X (f \in \Delta^c) \sim_X (1_{f^{-1}(\Delta)^c} = 1). \tag{1.55} $$

Finally, the truth tables yield logical (and hence semantic) equivalences like

$$ \alpha \vee \beta \sim_X \neg(\neg\alpha \wedge \neg\beta). \tag{1.56} $$

Combining the specific and the general equivalences (1.53) - (1.56), we have:


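Lemma 1.19. Any proposition in $\mathcal{B}_X$ is logically (and semantically) equivalent
(relative to $X$) to one of the form $1_U = 1$, for some event $U \subseteq X$. Furthermore,

$$ \neg(1_U = 1) \sim_X (1_{U^c} = 1); \tag{1.57} $$
$$ (1_U = 1) \wedge (1_V = 1) \sim_X (1_{U \cap V} = 1); \tag{1.58} $$
$$ (1_U = 1) \vee (1_V = 1) \sim_X (1_{U \cup V} = 1). \tag{1.59} $$


Theorem 1.20. The Lindenbaum algebra $\mathcal{L}_X$ is isomorphic (as a Boolean algebra)
to the power set $\mathcal{P}(X)$ of $X$ under the map $\varphi : \mathcal{L}_X \to \mathcal{P}(X)$ induced by

$$ \varphi([f \in \Delta]_X) = f^{-1}(\Delta). \tag{1.60} $$

In particular, the logical connectives $\neg$, $\wedge$ and $\vee$ (descended to $\mathcal{L}_X$) turn into
set-theoretic complementation $(-)^c$, intersection $\cap$, and union $\cup$, respectively, in that

$$ \varphi([\neg\alpha]_X) = \varphi([\alpha]_X)^c; \tag{1.61} $$
$$ \varphi([\alpha \wedge \beta]_X) = \varphi([\alpha]_X) \cap \varphi([\beta]_X); \tag{1.62} $$
$$ \varphi([\alpha \vee \beta]_X) = \varphi([\alpha]_X) \cup \varphi([\beta]_X), \tag{1.63} $$

and $\varphi$ maps the partial order $\leq$ on $\mathcal{L}_X$ into set-theoretic inclusion $\subseteq$, i.e.,

$$ [\alpha]_X \leq [\beta]_X \quad \text{iff} \quad \varphi([\alpha]_X) \subseteq \varphi([\beta]_X). \tag{1.64} $$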


This is immediate from Lemma 1.19. Interestingly, the Boolean algebra structure
just derived as the governor of the (propositional) logic of classical mechanics may
be reformulated in terms of the Jordan algebraic structure (1.46) - (1.47) of $\ell^\infty(X,\tau)$,
or, when $X$ is finite, of the C*-algebra of observables $C(X)$ itself:


• Events $U \subseteq X$ (and hence, by Theorem 1.20, logical equivalence classes of
propositions) correspond bijectively to characteristic functions $1_U$ on $X$, that is, to
yes-no questions (having spectrum in $\{0,1\}$). Algebraically, these are precisely
the idempotents in $\ell^\infty(X,\tau)$, i.e., those functions $e$ satisfying $e^2 = e$.


• In terms of those, the partial ordering and the logical connectives are given by

$$ e \leq f \quad \text{iff} \quad e \circ f = e; \tag{1.65} $$
$$ \neg e = 1_X - e; \tag{1.66} $$
$$ e \wedge f = e \circ f; \tag{1.67} $$
$$ e \vee f = e + f - e \circ f. \tag{1.68} $$


Indeed, in this case $\circ$ is pointwise multiplication (1.20). Using $1_U \cdot 1_V = 1_{U \cap V}$
yields (1.67), (1.65) comes down to $U \subseteq V$ iff $U \cap V = U$, (1.66) is $1_X - 1_U = 1_{U^c}$,
and (1.68) follows by writing its right-hand side as $1_X - (1_X - e) \wedge (1_X - f)$.
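
The dictionary between set operations and the algebraic formulas (1.65) - (1.68) can be verified exhaustively for a small phase space. The following Python sketch is an illustration only; the three-point set `X`, the tuple encoding of functions, and the function names are assumptions of ours. It runs over all pairs of events and checks that the algebraic operations on indicator functions reproduce complement, intersection, union, and inclusion.

```python
from itertools import combinations

X = (0, 1, 2)  # hypothetical three-point phase space


def indicator(U):
    """Characteristic function 1_U, encoded as a tuple indexed by X."""
    return tuple(1 if x in U else 0 for x in X)


def jordan(e, f):
    """In the classical case the Jordan product is pointwise multiplication."""
    return tuple(a * b for a, b in zip(e, f))


def leq(e, f):   # (1.65)
    return jordan(e, f) == e


def neg(e):      # (1.66)
    return tuple(1 - a for a in e)


def meet(e, f):  # (1.67)
    return jordan(e, f)


def join(e, f):  # (1.68)
    return tuple(a + b - c for a, b, c in zip(e, f, jordan(e, f)))


# All 2^|X| events, i.e. the power set of X.
events = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def check_boolean_dictionary():
    """Compare the algebraic operations with the set-theoretic ones."""
    for U in events:
        assert neg(indicator(U)) == indicator(frozenset(X) - U)
        for V in events:
            assert meet(indicator(U), indicator(V)) == indicator(U & V)
            assert join(indicator(U), indicator(V)) == indicator(U | V)
            assert leq(indicator(U), indicator(V)) == (U <= V)
    return True
```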
1.5 The GNS-construction for C(X)


As a bridge from classical to quantum mechanics (as well as a good exercise), we
finally inject some Hilbert space theory into classical physics by discussing the
GNS-construction of C*-algebra theory for the special case of $C(X)$, where $X$ remains
finite. In general, for each state $\omega$ on a C*-algebra $A$, the GNS-construction
canonically yields a Hilbert space $H_\omega$ (which is finite-dimensional for $A = C(X)$ with
finite $X$) and a representation of $A$ on $H_\omega$, in the sense of a (complex) linear map

$$ \pi_\omega : A \to B(H_\omega) \tag{1.69} $$

that satisfies

$$ \pi_\omega(ab) = \pi_\omega(a)\pi_\omega(b); \tag{1.70} $$
$$ \pi_\omega(a^*) = \pi_\omega(a)^*. \tag{1.71} $$

Furthermore, $H_\omega$ contains a special unit vector $\Omega_\omega$ that is cyclic for $\pi_\omega$ in that

$$ \pi_\omega(A)\Omega_\omega \equiv \{ \pi_\omega(a)\Omega_\omega,\ a \in A \} = H_\omega, \tag{1.72} $$

at least in the relevant case where $\dim(H_\omega) < \infty$; otherwise, the left-hand side is
merely dense in $H_\omega$ and one needs to take the (norm) closure to obtain $H_\omega$.
Furthermore, $\Omega_\omega$ realizes the state $\omega$ as a quantum-mechanical expectation value by

$$ \omega(a) = \langle \Omega_\omega, \pi_\omega(a)\Omega_\omega \rangle_{H_\omega}. \tag{1.73} $$

Given $\omega \in S(A)$, the GNS-construction starts with the vector spaces

$$ N_\omega = \{ a \in A \mid \omega(a^*a) = 0 \}; \tag{1.74} $$
$$ H_\omega = A/N_\omega. \tag{1.75} $$

Now, if $b \in N_\omega$ and $a \in A$, then $ab \in N_\omega$, because of the important inequality

$$ \omega(b^*a^*ab) \leq \|a\|^2\, \omega(b^*b). \tag{1.76} $$

This is true for any C*-algebra $A$, but below we prove it only for our example.
Assuming (1.76) for the moment, the action of $A$ on itself by left multiplication
descends to a well-defined action on $H_\omega$, which we call $\pi_\omega$. In other words, if
$b_\omega \in H_\omega$ is the image of $b \in A$ under the canonical projection $A \to A/N_\omega$, then

$$ \pi_\omega(a) b_\omega = (ab)_\omega. \tag{1.77} $$

Crucially, this vector space $H_\omega$ is equipped with a canonical inner product

$$ \langle a_\omega, b_\omega \rangle = \omega(a^*b). \tag{1.78} $$

Indeed, this form is well defined, and is positive definite because $\omega$ is a state.

In general, $H_\omega$ as defined by (1.75) with inner product (1.78) is merely a
pre-Hilbert space, which needs to be completed in the associated norm, and it takes some
effort to check that the operators defined by (1.77) are bounded. In our example, on
the other hand, $H_\omega$ is finite-dimensional and hence complete. In any case, it is easy
to verify the properties (1.70) - (1.73), whilst (1.72) holds with $\Omega_\omega$ given by the
image of the unit $1_A$ in $H_\omega$.

We now prove (1.76) for $A = C(X)$. From Theorem 1.15 we have $\omega = E_P$, and by
(1.9) and (1.24), the inequality (1.76) comes down to the obviously correct result

$$ \sum_x p(x) |f(x)|^2 |g(x)|^2 \leq \|f\|_\infty^2 \sum_x p(x) |g(x)|^2. \tag{1.79} $$

Writing $N_{E_P} \equiv N_P$, we may also check directly that if $g \in N_P$ and $f \in C(X)$, then
$fg \in N_P$. Indeed, in terms of the set $\mathrm{supp}(P) \subseteq X$ defined by

$$ \mathrm{supp}(P) = \{ x \in X \mid p(x) > 0 \}, \tag{1.80} $$

we have

$$ N_P = \{ f \in C(X) \mid f(x) = 0 \ \forall x \in \mathrm{supp}(P) \}, \tag{1.81} $$

and clearly $g = 0$ on $\mathrm{supp}(P)$ implies $fg = 0$ on $\mathrm{supp}(P)$. We now compute $H_P$ and
$\pi_P$. From (1.81) we have $f - g \in N_P$ and hence $f \sim g$ iff $f(x) = g(x)$ for all
$x \in \mathrm{supp}(P)$, where $\sim$ is the equivalence relation whose equivalence classes $f_P$ define
elements of $H_P = C(X)/N_P$. Hence $f_P$ is simply the restriction of $f$ to $\mathrm{supp}(P)$, and

$$ H_P = \ell^2(X,P) \tag{1.82} $$

is the Hilbert space that consists of these restrictions, with inner product

$$ \langle f_P, g_P \rangle = \sum_{x \in \mathrm{supp}(P)} p(x) \overline{f(x)} g(x). \tag{1.83} $$

The representation (1.77) then trivially gives

$$ \pi_P(f) g_P = f_P g_P, \tag{1.84} $$

so that $\pi_P(f)$ is the multiplication operator defined by $f$ on $\ell^2(X,P)$. In functional
analysis one often denotes elements $g_P \in \ell^2(X,P)$ by the functions $g$ themselves,
and similarly writes $\pi_P(f)$ as $f$, so that (1.84) simply reads $\pi_P(f)g = fg$.

The operator norm of $\pi_P(f)$ is easily computed to be

$$ \|\pi_P(f)\| = \sup\{ |f(x)|,\ x \in \mathrm{supp}(P) \} \equiv \|f_{|\mathrm{supp}(P)}\|_\infty. \tag{1.85} $$

Indeed, the bound $\|\pi_P(f)\| \leq \|f_{|\mathrm{supp}(P)}\|_\infty$ is immediate from the definition

$$ \|\pi_P(f)\| = \sup\{ \|\pi_P(f) g_P\|,\ g_P \in H_P,\ \|g_P\| = 1 \}, \tag{1.86} $$

and equality in this bound follows from applying the operator $\pi_P(f)$ to the function
$g = 1_U$, where $U \subseteq X$ is any set where $|f|$ attains its maximum $\|f_{|\mathrm{supp}(P)}\|_\infty$.
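
The whole GNS-construction for $C(X)$ fits in a few lines of code, which may help to see why only the support of $P$ matters. The Python sketch below makes assumptions of its own: a four-point set `X`, a distribution `p` supported on two points, real-valued sample functions, and hypothetical helper names. It checks that a function vanishing on $\mathrm{supp}(P)$ has zero norm, and that the operator norm of $\pi_P(f)$ is the supremum of $|f|$ over $\mathrm{supp}(P)$ rather than over all of $X$.

```python
from fractions import Fraction

X = [0, 1, 2, 3]  # hypothetical finite phase space
p = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(0)}
supp = [x for x in X if p[x] > 0]  # supp(P), cf. (1.80)


def restrict(f):
    """The class f_P, i.e. the restriction of f to supp(P)."""
    return {x: f[x] for x in supp}


def inner(f, g):
    """<f_P, g_P> = sum over supp(P) of p(x) conj(f(x)) g(x), cf. (1.83)."""
    return sum(p[x] * f[x].conjugate() * g[x] for x in supp)


def pi(f, g):
    """pi_P(f) g_P = (f g)_P, the multiplication operator of (1.84)."""
    return {x: f[x] * g[x] for x in supp}


def op_norm(f):
    """Operator norm of pi_P(f) according to (1.85)."""
    return max(abs(f[x]) for x in supp)


f = {0: 1, 1: -2, 2: 100, 3: 7}  # large values outside supp(P) are invisible
h = {0: 0, 1: 0, 2: 5, 3: 1}     # vanishes on supp(P), hence lies in N_P
g = {0: 0, 1: 1, 2: 0, 3: 0}     # indicator of U = {1}, where |f| is maximal on supp(P)
```

Norms are compared through their squares, so that everything stays in exact rational arithmetic.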
Notes


§1.1. Basic constructions of probability theory 
§1.2. Classical observables and states 

For (advanced) treatments of convexity theory and probability theory in contexts 
relevant to mathematical physics we recommend Israel (1979), Alfsen & Shultz 
(2001), and Simon (2001). 


§1.3. Pure states and transition probabilities

Transition probabilities (in the abstract sense meant here) were introduced by von 
Neumann, but his manuscript from 1937 was only published in 1981 as von Neu- 
mann (1981/1937). This remarkable paper has remained largely unused (or even un- 
known) in both mathematical physics and operator algebras; Mielnik (1968), Shultz 
(1982), and Landsman (1996, 1997) are exceptions. An extensive discussion with 
further references may be found in Landsman (1998a). 


§1.4. The logic of classical mechanics

Unless one counts Boole (1847), it seems that the logical analysis of classical 
mechanics was initiated by the famous paper of Birkhoff & von Neumann (1936), 
which was primarily concerned with quantum logic (cf. §2.10). Our use of semantic 
implication (also in the quantum case) was inspired by Rédei (1998). 


§1.5. The GNS-construction for C(X)
See §C.12 for the GNS-construction in general. 


Chapter 2 


Quantum mechanics on a finite-dimensional 
Hilbert space 


The quantum analogue of a finite set $X$ (in its role as a configuration space in
classical mechanics) is the finite-dimensional Hilbert space $\ell^2(X)$, by which we mean
the vector space of functions $\psi : X \to \mathbb{C}$, equipped with the inner product

$$ \langle \psi, \varphi \rangle = \sum_{x \in X} \overline{\psi(x)} \varphi(x). \tag{2.1} $$

There is no issue of convergence here, but later on we will use the same notation
for infinite sets $X$, where $\ell^2(X)$ is restricted to those functions (i.e. sequences) for
which $\sum_{x \in X} |\psi(x)|^2 < \infty$ (which also guarantees convergence of the sum in (2.1)).
If $X = \underline{n}$ as sets (i.e., $|X| = n$), we have a unitary isomorphism of Hilbert spaces

$$ \ell^2(\underline{n}) \cong \mathbb{C}^n \tag{2.2} $$

through the map $\psi \mapsto (\psi(1), \ldots, \psi(n))$, where $\mathbb{C}^n$ has the standard inner product
$\langle w, z \rangle = \sum_i \overline{w_i} z_i$. In particular, the function $\delta_k \in \ell^2(\underline{n})$, defined by $\delta_k(l) = \delta_{kl}$, is
mapped to the $k$'th standard basis vector $u_k \equiv |k\rangle$ of $\mathbb{C}^n$, i.e., $u_1 = (1,0,\ldots,0)$, etc.
In the special case $X = \underline{N}^\Lambda$ considered in Chapter 1, we have $|X| = N^{|\Lambda|}$ and hence

$$ \ell^2(\underline{N}^\Lambda) \cong \mathbb{C}^{N^{|\Lambda|}} \cong (\mathbb{C}^N)^{\otimes |\Lambda|} \equiv \bigotimes_{n \in \Lambda} \mathbb{C}^N_n, \tag{2.3} $$

where $\mathbb{C}^N_n = \mathbb{C}^N$ for each $n \in \Lambda$, so that the suffix $n$ merely labels which copy of $\mathbb{C}^N$
is meant (see §C.13 for tensor products of Hilbert spaces). Explicitly, a canonical
unitary isomorphism $\ell^2(\underline{N}^\Lambda) \to \bigotimes_n \mathbb{C}^N$ is given by linear extension of the map

$$ \delta_x \mapsto \otimes_{n \in \Lambda} u_{x(n)}, \tag{2.4} $$

where $x : \Lambda \to \underline{N}$ and hence $u_{x(n)} \in \mathbb{C}^N$. Thus elements of the tensor product $\bigotimes_n \mathbb{C}^N$
may be seen as wave-functions on spin configuration space (and vice versa). In
particular, elementary tensor products of basis vectors in $\bigotimes_n \mathbb{C}^N$ correspond to
wave-functions in $\ell^2(\underline{N}^\Lambda)$ that are $\delta$-peaked at some ‘classical’ spin configuration.
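
The map (2.4) can be made concrete with Kronecker products of standard basis vectors. The sketch below is a hedged illustration with conventions of our own: spin values are labelled $0, \ldots, N-1$ rather than $1, \ldots, N$, a configuration is a tuple with one entry per site, and the names `basis_vector`, `kron`, and `embed` are hypothetical. It confirms that the $N^{|\Lambda|}$ configurations are sent to distinct standard basis vectors of $\mathbb{C}^{N^{|\Lambda|}}$.

```python
from itertools import product


def basis_vector(k, dim):
    """Standard basis vector u_k of C^dim (0-based label k)."""
    return [1 if i == k else 0 for i in range(dim)]


def kron(v, w):
    """Kronecker (tensor) product of two coordinate vectors."""
    return [a * b for a in v for b in w]


def embed(x, N):
    """Image of delta_x under (2.4) for a configuration x = (x(1), ..., x(|Lambda|))."""
    vec = [1]
    for spin in x:
        vec = kron(vec, basis_vector(spin, N))
    return vec


N, sites = 2, 3  # e.g. three two-level spins
images = {x: embed(x, N) for x in product(range(N), repeat=sites)}
```

With this ordering of the Kronecker product, the configuration `(1, 0, 1)` lands on the basis vector with index 5, i.e. the binary number 101.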
2.1 Quantum probability theory and the Born rule


In preparation for this chapter, the reader would do well to review Appendix A. 
The probabilistic setting of quantum mechanics is given by the following coun- 
terpart of Definition 1.1 (from which conditional probabilities are lacking, though). 


Definition 2.1. Let $H$ be a finite-dimensional Hilbert space.


1. A (quantum) event is a linear subspace $L$ of $H$ (which is automatically closed).
2. A (quantum) probability distribution is a density operator, i.e., a positive
operator $\rho$ on $H$ (in that $\langle \psi, \rho\psi \rangle \geq 0$ for all $\psi \in H$) such that

$$ \mathrm{Tr}(\rho) = 1. \tag{2.5} $$

We denote the set of all density operators on $H$ by $\mathcal{D}(H)$.
3. A (quantum) random variable is a self-adjoint operator $a$ on $H$ (i.e., $a^* = a$).
4. The spectrum of a self-adjoint operator $a$ is the set $\sigma(a) \subset \mathbb{R}$ of its eigenvalues.


Being positive, a density matrix $\rho$ is self-adjoint, so by Theorem A.10, notably
(A.40), and Definition 2.1.2 we have

$$ \rho = \sum_i p_i |\upsilon_i\rangle\langle\upsilon_i|, \quad p_i > 0, \quad \sum_i p_i = 1, \tag{2.6} $$

where the $(\upsilon_i)$ form an orthonormal set in $H$ and $|\upsilon_i\rangle\langle\upsilon_i|$ is the (orthogonal)
projection on the one-dimensional subspace $\mathbb{C} \cdot \upsilon_i$. As in the classical case, one special
class of density operators and one special class of random variables stand out:


• Each unit vector $\psi \in H$ defines a density operator

$$ \rho_\psi \equiv e_\psi = |\psi\rangle\langle\psi|, \tag{2.7} $$

i.e., the (orthogonal) projection $e_\psi$ on the one-dimensional subspace $\mathbb{C} \cdot \psi$. A
basis (which by convention always means an orthonormal basis) of eigenvectors
of $\rho_\psi$ consists of $\upsilon_1 = \psi$ itself, supplemented by any basis $(\upsilon_2, \ldots, \upsilon_{\dim(H)})$ of
the orthogonal complement of $\mathbb{C} \cdot \psi$. The corresponding probabilities in (2.6) are
evidently $p_1 = 1$ and $p_i = 0$ for all $i > 1$.

• Each quantum event $L \subseteq H$ defines the corresponding projection $e_L$ (which is
self-adjoint, i.e. a random variable): if $(\upsilon_j)$ is a basis of $L$, then $e_L = \sum_j |\upsilon_j\rangle\langle\upsilon_j|$.
If $L = H$ then $e_L = 1$ with $\sigma(e_L) = \{1\}$. If $L = \{0\}$ then $e_L = 0$ with $\sigma(e_L) = \{0\}$.
In all other cases, i.e. for proper subspaces $L$, one has $\sigma(e_L) = \{0,1\}$.
Conversely, any self-adjoint operator $a$ with spectrum $\sigma(a) \subseteq \{0,1\}$ is given by
$a = e_L$ for some subspace $L \subseteq H$; just take $L = \{\psi \in H \mid a\psi = \psi\}$. Such operators
correspond to yes-no questions to the system and lie at the basis of the logical
interpretation of quantum theory due to Birkhoff and von Neumann; see §2.10.


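The following quantum analogue of Theorem 1.2 is based on Theorem A.10.

Theorem 2.2. A density operator $\rho$ on $H$ and a self-adjoint operator $a : H \to H$
jointly yield a probability distribution $p_a$ on the spectrum $\sigma(a)$ by the Born rule

$$ p_a(\lambda) = \mathrm{Tr}(\rho e_\lambda). \tag{2.8} $$

The associated probability measure $P_a$ is given at $\Delta \subseteq \sigma(a)$ by (cf. (A.42))

$$ P_a(\Delta) = \mathrm{Tr}(\rho e_\Delta). \tag{2.9} $$


Proof. Positivity of the numbers $p_a(\lambda)$ follows by taking the trace over a basis of
eigenvectors $\upsilon_i$ of $\rho$, with corresponding eigenvalues $p_i \geq 0$. This yields

$$ \mathrm{Tr}(\rho e_\lambda) = \sum_i p_i \|e_\lambda \upsilon_i\|^2 \geq 0. $$

Eqs. (A.38) and (2.5) then give $\sum_\lambda p_a(\lambda) = 1$. Eq. (2.9) follows from the equality
$P_a(\Delta) = \sum_{\lambda \in \Delta} p_a(\lambda)$, cf. (1.2), and (A.42).


In particular, if $\rho = \rho_\psi$, writing $p^\psi_a$ for the associated probability, (2.8) yields

$$ p^\psi_a(\lambda) = \langle \psi, e_\lambda \psi \rangle = \|e_\lambda \psi\|^2. \tag{2.10} $$

If in addition $\lambda \in \sigma(a)$ is non-degenerate, so that $e_\lambda = |\upsilon_\lambda\rangle\langle\upsilon_\lambda|$ for some unit vector
$\upsilon_\lambda$ with $a\upsilon_\lambda = \lambda\upsilon_\lambda$, then the Born rule (2.9) assumes its original form

$$ p^\psi_a(\lambda) = |\langle \psi, \upsilon_\lambda \rangle|^2. \tag{2.11} $$

Specializing (2.10) to the random variable $a = e_L$ defined by an event $L \subseteq H$ yields

$$ p^\psi_{e_L}(1) = \|e_L \psi\|^2. \tag{2.12} $$

If $L = \mathbb{C} \cdot \varphi$ is one-dimensional, too, in which case we write $p^\psi_{e_\varphi} \equiv p^\psi_\varphi$, we have

$$ p^\psi_\varphi(1) = |\langle \psi, \varphi \rangle|^2; \tag{2.13} $$

note the following equality of probability distributions on $\sigma(e_\varphi) = \sigma(e_\psi) = \{0,1\}$:

$$ p^\psi_\varphi = p^\varphi_\psi. \tag{2.14} $$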

Expectation values and variances may be defined as in the classical case, viz.

$$ E_\rho(a) = \mathrm{Tr}(\rho a); \tag{2.15} $$
$$ \Delta_\rho(a) = E_\rho(a^2) - E_\rho(a)^2. \tag{2.16} $$

Similar to (1.11), we may also write the expectation value as

$$ E_\rho(a) = \sum_{\lambda \in \sigma(a)} \lambda \cdot p_a(\lambda). \tag{2.17} $$

The special case $\rho = \rho_\psi$, for which we write $E_{\rho_\psi} \equiv E_\psi$, gives the usual formula

$$ E_\psi(a) = \mathrm{Tr}(\rho_\psi a) = \langle \psi, a\psi \rangle. \tag{2.18} $$
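
To see the Born rule (2.8) and the formulas (2.15) - (2.17) at work, one may take $H = \mathbb{C}^2$ and the first Pauli matrix as the random variable. The Python sketch below is an illustration under assumptions of ours: the particular density matrix `rho`, the helper functions `matmul` and `trace`, and the use of exact fractions are all hypothetical choices. The spectral projections of $\sigma_1$ for the eigenvalues $\pm 1$ are $\tfrac12(1 \pm \sigma_1)$.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


rho = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]       # positive with unit trace
sigma_1 = [[F(0), F(1)], [F(1), F(0)]]               # first Pauli matrix
projections = {
    +1: [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]],    # e_{+1} = (1 + sigma_1) / 2
    -1: [[F(1, 2), F(-1, 2)], [F(-1, 2), F(1, 2)]],  # e_{-1} = (1 - sigma_1) / 2
}

# Born probabilities p_a(lambda) = Tr(rho e_lambda), cf. (2.8).
born = {lam: trace(matmul(rho, e)) for lam, e in projections.items()}

# Expectation value (2.15) and variance (2.16).
expectation = trace(matmul(rho, sigma_1))
variance = trace(matmul(rho, matmul(sigma_1, sigma_1))) - expectation ** 2
```

The expectation value computed from the trace agrees with the weighted sum over the spectrum in (2.17).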


As in the classical case one always has $\Delta_\rho(a) \geq 0$, but a major contrast between
classical and quantum mechanics lies in the following result, cf. Proposition 1.3.


Proposition 2.3. For each density operator $\rho$ there exists a self-adjoint operator $b$
such that $\Delta_\rho(b) > 0$. On the other hand, if $a^* = a$, then $\Delta_\rho(a) = 0$ iff the image of $\rho$
lies in some fixed eigenspace of $a$, i.e., in terms of the spectral decomposition (2.6)
we have $a\upsilon_i = \lambda\upsilon_i$ where $\lambda$ is independent of $i$.


Proof. We first prove the first claim for $H = \mathbb{C}^2$. By an appropriate choice of basis,
we may assume that $\rho$ is diagonal, i.e., $\rho = \mathrm{diag}(p_1, p_2)$, with $p_1, p_2 \in [0,1]$ and
$p_1 + p_2 = 1$. Now take $b = \sigma_1$ (i.e., the first Pauli matrix), so that $\mathrm{Tr}(\rho b) = 0$ and
$\mathrm{Tr}(\rho b^2) = 1$. Hence $\Delta_\rho(b) = 1$. Secondly, for general $H \cong \mathbb{C}^n$, diagonalize $\rho$ and
order the eigenvectors such that the above $2 \times 2$ case forms the upper left block, with
at least one of the eigenvalues $p_1, p_2$ strictly positive. Take $b$ to be $\sigma_1$ in the upper
left corner, and zero elsewhere. Then $\mathrm{Tr}(\rho b) = 0$ and $\mathrm{Tr}(\rho b^2) = p_1 + p_2$, so that
$\Delta_\rho(b) = p_1 + p_2 > 0$.
For the second claim we use (2.6), and write $\rho_i = \rho_{\upsilon_i}$. We note the inequality

$$ \Delta_\rho(a) \geq \sum_i p_i \Delta_{\rho_i}(a), \tag{2.19} $$

with equality iff $\rho_i(a) = \rho_j(a)$ for all $i, j$; this follows from convexity of the function
$x \mapsto x^2$. We now show that for any unit vector $\psi$ we have $\Delta_{\rho_\psi}(a) = 0$ iff $a\psi = \lambda\psi$.
Assuming the latter gives $E_\psi(a) = \langle \psi, a\psi \rangle = \lambda$ and likewise $E_\psi(a^2) = \lambda^2$, hence
$\Delta_{\rho_\psi}(a) = 0$. In the opposite direction, using $a^* = a$, elementary manipulations yield

$$ \Delta_{\rho_\psi}(a) = \|(a - \langle \psi, a\psi \rangle)\psi\|^2. \tag{2.20} $$

This clearly vanishes iff $a\psi = \langle \psi, a\psi \rangle \psi$, so $a\psi = \lambda\psi$, with $\lambda = \langle \psi, a\psi \rangle$.
Putting $\psi = \upsilon_i$ gives $\Delta_{\rho_i}(a) = 0$ iff $a\upsilon_i = \lambda_i\upsilon_i$, and then $\Delta_{\sum_i p_i \rho_i}(a) = 0$ iff in addition
$\rho_i(a) = \rho_j(a)$ for all $i, j$. Since $\rho_i(a) = \langle \upsilon_i, a\upsilon_i \rangle = \lambda_i$, we obtain $\lambda_i = \lambda_j$.
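
Both halves of Proposition 2.3 can be illustrated numerically in $H = \mathbb{C}^3$. The following sketch is hedged: the diagonal density matrices, the operators `a` and `b`, and the helper names are assumptions made for illustration. It shows that $\sigma_1$ placed in the upper left corner has variance $p_1 + p_2$, and that the variance of `a` vanishes exactly when the image of the density matrix lies in a single eigenspace of `a`.

```python
from fractions import Fraction as F


def diag(*entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def variance(rho, a):
    """Delta_rho(a) = Tr(rho a^2) - Tr(rho a)^2, cf. (2.15) - (2.16)."""
    mean = trace(matmul(rho, a))
    return trace(matmul(rho, matmul(a, a))) - mean ** 2


rho = diag(F(1, 2), F(1, 4), F(1, 4))        # p_1 = 1/2, p_2 = 1/4, p_3 = 1/4
b = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]        # sigma_1 in the upper left corner
a = diag(2, 2, 5)                            # eigenspaces: span(e1, e2) and span(e3)
rho_in_eigenspace = diag(F(1, 2), F(1, 2), 0)  # image inside the eigenvalue-2 eigenspace
```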


As first recognized by von Neumann, Theorem 2.2 may be generalized to a family
of self-adjoint operators as long as they commute. Thus we obtain the following
counterpart of (1.12) - (1.13): a collection $a_1, \ldots, a_n$ of $n$ commuting self-adjoint
operators and a (single) density operator $\rho$ on $H$ jointly define a probability
distribution $p_{a_1,\ldots,a_n}$ on the product $\sigma(a_1) \times \cdots \times \sigma(a_n)$ of the individual spectra by

$$ p_{a_1,\ldots,a_n}(\lambda_1, \ldots, \lambda_n) = \mathrm{Tr}\left( \rho\, e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n} \right). \tag{2.21} $$

The proof of positivity of these numbers requires the spectral projections $e^{(i)}_{\lambda_i}$ to
commute, which they do provided the $a_i$ commute (if the $a_i$ fail to commute, positivity
of (2.21) is not guaranteed, although the numbers do still sum up to unity; the possibility of
defining joint probabilities is strictly limited to commuting random variables).
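
The failure of positivity for non-commuting operators is easy to exhibit in $H = \mathbb{C}^2$. The Python sketch below is an illustration with choices of our own: the unit vector `psi`, the pairing of the spectral projections of $\sigma_3$ with those of $\sigma_1$, and the helper names are all assumptions. It evaluates the right-hand side of (2.21) for the four pairs of projections; the resulting numbers sum to one, but one of them is negative.

```python
from fractions import Fraction as F


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


psi = (F(3, 5), F(-4, 5))  # real unit vector, so |psi><psi| has entries psi_i psi_j
rho = [[psi[i] * psi[j] for j in range(2)] for i in range(2)]

# Spectral projections of sigma_3 (labelled 0, 1) and of sigma_1 (labelled "+", "-").
e = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}
f = {"+": [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]],
     "-": [[F(1, 2), F(-1, 2)], [F(-1, 2), F(1, 2)]]}

# Candidate 'joint probabilities' Tr(rho e_i f_j), cf. (2.21).
q = {(i, j): trace(matmul(rho, matmul(e[i], f[j]))) for i in e for j in f}
```

For other choices of `psi` the numbers may even be complex, since $e_i f_j$ is not self-adjoint when the projections fail to commute.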
2.2 Quantum observables and states


Given a finite-dimensional Hilbert space $H$, the set $B(H)$ of all linear operators on $H$
(which for $H = \mathbb{C}^n$ may be identified with the set $M_n(\mathbb{C})$ of complex $n \times n$ matrices)
forms an involutive algebra under the natural (pointwise) operations

$$ (\lambda \cdot a)\psi = \lambda(a\psi); \tag{2.22} $$
$$ (a + b)\psi = a\psi + b\psi; \tag{2.23} $$
$$ (ab)\psi = a(b\psi), \tag{2.24} $$

and finally with $a^*$ given by the usual operator adjoint (A.15). Compare the
corresponding classical expressions (1.18) - (1.20) and (1.22). Analogous to (1.24), we
also have a norm on $B(H)$, defined by (A.18). It follows that like its classical
counterpart $C(X)$, the involutive algebra $B(H)$ (or, in this case, $M_n(\mathbb{C})$) is a C*-algebra,
cf. Definition C.1 in Appendix C. It crucially differs from $C(X)$ in that $B(H)$ is
non-commutative (as soon as $\dim(H) > 1$). For this reason, the Gelfand spectrum, which in
the classical case allowed us to reconstruct $X$ from $C(X)$, turns out to be empty, cf.
Proposition 2.10 below. Nonetheless, it makes good sense to copy Definition 1.14,
mutatis mutandis:


Definition 2.4. A state on $B(H)$ is a complex-linear map $\omega : B(H) \to \mathbb{C}$ satisfying:


1. $\omega(a^*a) \geq 0$ for each $a \in B(H)$ (positivity);
2. $\omega(1_H) = 1$ (normalization).


The state space $S(B(H))$ is the set of all states $\omega : B(H) \to \mathbb{C}$.


Physicists may not like this definition, since it involves non-observable quantities.
As in the classical case, we may introduce the self-adjoint (or ‘real’) part of $B(H)$:

$$ B(H)_{\mathrm{sa}} = \{ a \in B(H) \mid a^* = a \}, \tag{2.25} $$

which is a real vector space (though not a real algebra in the usual sense, cf. §C.25).


Definition 2.5. A state on $B(H)_{\mathrm{sa}}$ is a real-linear map $\omega : B(H)_{\mathrm{sa}} \to \mathbb{R}$ satisfying:

1. $\omega(a^2) \geq 0$ for each $a \in B(H)$ with $a^* = a$ (positivity);
2. $\omega(1_H) = 1$ (normalization).

The state space $S(B(H)_{\mathrm{sa}})$ is the set of all states $\omega : B(H)_{\mathrm{sa}} \to \mathbb{R}$.


Fortunately, there is no need for a fight over this point; the discussion is similar to
the one below Definition 1.14 and is settled as follows.


Proposition 2.6. The state spaces $S(B(H))$ and $S(B(H)_{\mathrm{sa}})$ may be identified: an
element $\omega$ of the former defines an element $\omega_{\mathbb{R}}$ of the latter by restriction, whilst the
unique decomposition $c = a + ib$ (where $a^* = a$ and $b^* = b$ are given by $a = \tfrac{1}{2}(c + c^*)$
and $b = -\tfrac{1}{2}i(c - c^*)$, respectively) gives $\omega(c) = \omega_{\mathbb{R}}(a) + i\,\omega_{\mathbb{R}}(b)$. Moreover,

$$ \|\omega\| = \|\omega_{\mathbb{R}}\| = 1. \tag{2.26} $$

Here the norm on the dual (Banach) space $B(H)^*_{\mathrm{sa}}$ of $B(H)_{\mathrm{sa}}$ is given by

$$ \|\omega\| = \sup\{ |\omega(a)|,\ a \in B(H)_{\mathrm{sa}},\ \|a\| = 1 \}. \tag{2.27} $$

This proposition holds for any Hilbert space $H$ (cf. Theorem C.52), but it is instructive
to restrict our proof to the finite-dimensional setting in which we currently work.


Proof. The first few claims are immediate from Proposition A.22. To prove (2.26),
it suffices to prove that for any $a \in B(H)$ one has

$$ |\omega(a)| \leq \|a\|, \tag{2.28} $$

since by normalization of states the bound is saturated by $a = 1$. Furthermore, even
if $\omega$ is seen as an element of $B(H)^*$ rather than $B(H)^*_{\mathrm{sa}}$, eq. (2.28) needs to be shown
only for self-adjoint $a$, for positivity of $\omega$ implies the Cauchy-Schwarz inequality

$$ |\omega(a^*b)|^2 \leq \omega(a^*a)\,\omega(b^*b), \tag{2.29} $$

cf. (A.1), in which we may take $a = 1$, to find, assuming (2.28) for self-adjoint $a$,

$$ |\omega(b)|^2 \leq \omega(b^*b) \leq \|b^*b\| = \|b\|^2, \tag{2.30} $$

where the last equality holds for any $b \in B(H)$ (turning the latter into a C*-algebra).
Noting that $b^*b$ is self-adjoint, this gives (2.28) for any $a$. To prove (2.28) for $a^* = a$,
then, we firstly use (A.47), and secondly use Theorem 2.7 and eq. (2.6) to obtain

$$ |\omega(a)| = |\mathrm{Tr}(\rho a)| = \Big| \sum_i p_i \langle \upsilon_i, a\upsilon_i \rangle \Big| \leq \sum_i p_i |\langle \upsilon_i, a\upsilon_i \rangle|. \tag{2.31} $$

Now let $(u_j)$ be a basis of $H$ consisting of eigenvectors of $a$, with $a u_j = \lambda_j u_j$, so that

$$ \langle \upsilon_i, a\upsilon_i \rangle = \sum_j \lambda_j |\langle \upsilon_i, u_j \rangle|^2, \qquad \sum_j |\langle \upsilon_i, u_j \rangle|^2 = 1. $$

Since $|\lambda_j| \leq \|a\|$ and $\sum_i p_i = 1$, the bound (2.28) follows from the estimate

$$ \sum_i p_i |\langle \upsilon_i, a\upsilon_i \rangle| \leq \sum_i p_i \sum_j |\langle \upsilon_i, u_j \rangle|^2 |\lambda_j| \leq \sum_i p_i \sum_j |\langle \upsilon_i, u_j \rangle|^2 \|a\| = \|a\|. \tag{2.32} $$

Finally, combining (2.31) and (2.32) gives (2.28) for self-adjoint $a$.


In view of this, we may work with either $S(B(H)_{\mathrm{sa}})$ or $S(B(H))$; denoting states
simply by $\omega$, the context will usually show if it is defined on $B(H)_{\mathrm{sa}}$ or on $B(H)$.
Despite its easy proof, the following result is of fundamental importance.


Theorem 2.7. If $H$ is finite-dimensional, there is a bijective correspondence between
states $\omega$ on $B(H)$ or $B(H)_{\mathrm{sa}}$ and density operators $\rho$ on $H$, given by

$$ \omega(a) = \mathrm{Tr}(\rho a). \tag{2.33} $$

Proof. First note that linear algebra already yields (2.33) as a bijective correspondence
between complex-linear maps $\omega$ and operators $\rho$, for example, because

$$ \langle a, b \rangle = \mathrm{Tr}(a^*b) \tag{2.34} $$

defines an inner product on $B(H)$. Positivity and normalization of $\omega$ then translate
to the corresponding properties of $\rho$.


The quantum analogue of Theorem 1.15, then, is as follows. 


Theorem 2.8. The state space $S(B(H)_{\mathrm{sa}}) \cong S(B(H))$ forms a compact convex set in
the (real) vector space $B(H)^*_{\mathrm{sa}}$ (in its w*-topology) and, putting the corresponding
topology on $\mathcal{D}(H)$, eq. (2.33) defines an affine homeomorphism

$$ S(B(H)) \cong \mathcal{D}(H). \tag{2.35} $$


Proof. Convexity of $S(B(H))$ holds by Definition 2.4. For compactness, by
Proposition 2.6 the state space $S(B(H))$ is contained in the closed unit ball of $B(H)^*_{\mathrm{sa}}$,
which is compact in the w*-topology (in the case at hand this is simply because
$B(H)^*_{\mathrm{sa}}$ is finite-dimensional). It is easy to see that a convergent sequence of states
actually converges to a state, since both conditions in Definition 2.4 are clearly
preserved by w*-limits (in which $\omega_n \to \omega$ iff $\omega_n(a) \to \omega(a)$ for each $a \in B(H)$).


For infinite-dimensional Hilbert spaces eq. (2.35) is false; see §4.2. At the opposite
end, the case $H = \mathbb{C}^2$ provides a beautiful illustration of this theorem (and more).


Proposition 2.9. The state space $S(M_2(\mathbb{C}))$ of the $2 \times 2$ matrices is isomorphic (as a compact convex set) to the closed unit ball $B^3 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq 1\}$. Under this isomorphism, the extreme boundary (cf. Definition 1.10)

$$\partial_e B^3 = S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\} \tag{2.36}$$

corresponds to the set of all density matrices $\rho = e_\psi$, where $\psi \in \mathbb{C}^2$ with $\|\psi\| = 1$.


Proof. Any self-adjoint $2 \times 2$ matrix may be parametrized by $(t,x,y,z) \in \mathbb{R}^4$ as

$$\rho(t,x,y,z) = \frac{1}{2} \begin{pmatrix} t+z & x-iy \\ x+iy & t-z \end{pmatrix}. \tag{2.37}$$

The eigenvalues $\lambda_\pm$ of $\rho(t,x,y,z)$, computed from its characteristic polynomial, are

$$\lambda_\pm = \tfrac{1}{2}\left(t \pm \sqrt{x^2+y^2+z^2}\right). \tag{2.38}$$

Condition (2.5) yields $t = 1$. Positivity of $\rho(1,x,y,z)$ is equivalent to positivity of its eigenvalues $\lambda_\pm$, which gives $x^2+y^2+z^2 \leq 1$. For the second claim, note that the $e_\psi$ are just the one-dimensional projections, which in turn are the density matrices satisfying $\rho^2 = \rho$ (or require $\lambda_+ = 1$, $\lambda_- = 0$), so $x^2+y^2+z^2 = 1$. Finally, since convex sums $tv + (1-t)w$ in $B^3$ ($0 < t < 1$) are given by straight line segments connecting $w$ and $v$ in $\mathbb{R}^3$, it immediately follows geometrically that $\partial_e B^3 = S^2$.
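The eigenvalue computation in this proof is easy to check numerically. The following Python sketch uses only the standard library; the function names are illustrative choices and do not come from the text. It evaluates the eigenvalues of $\rho(1,x,y,z)$ and tests positivity and purity for a given point $(x,y,z)$.

```python
import math

def bloch_matrix(x, y, z, t=1.0):
    """rho(t,x,y,z) of (2.37) as a nested list of complex numbers."""
    return [[complex(t + z, 0) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, complex(t - z, 0) / 2]]

def bloch_eigenvalues(x, y, z, t=1.0):
    """Eigenvalues (lambda_plus, lambda_minus) of rho(t,x,y,z), cf. (2.38)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (t + r) / 2, (t - r) / 2

def determinant(m):
    """Determinant of a 2x2 matrix; real for self-adjoint input."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def is_state(x, y, z, tol=1e-12):
    """rho(1,x,y,z) has unit trace; it is positive iff lambda_minus >= 0."""
    return bloch_eigenvalues(x, y, z)[1] >= -tol

def is_pure(x, y, z, tol=1e-12):
    """Pure states have eigenvalues (1, 0), i.e. x^2 + y^2 + z^2 = 1."""
    lam_plus, lam_minus = bloch_eigenvalues(x, y, z)
    return abs(lam_plus - 1) < tol and abs(lam_minus) < tol
```

For instance, `is_state(0, 0, 0)` holds for the centre of the ball (the maximally mixed state), whereas `is_pure` holds only on the sphere, e.g. at `(0, 0, 1)`.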
2.3 Pure states in quantum mechanics


In classical physics, the phase space $X$ arose both as the Gelfand spectrum $\Sigma(C(X))$ of the C*-algebra of observables $C(X)$, cf. Definition 1.4 and Proposition 1.5, and as the pure state space $P(C(X))$ of $C(X)$, see Definition 1.10 and Theorem 1.16. In particular, $\Sigma(C(X)) = P(C(X))$ at least as sets. Because of this, any pure state $\omega \in P(C(X))$ is dispersion-free, since as an element of $\Sigma(C(X))$ it satisfies $\omega(f^2) = \omega(f)^2$ for any $f \in C(X)$. These two definitionally different (but classically coinciding) guises of $X$ will fall apart in quantum mechanics; cf. Proposition 2.3.


Proposition 2.10. If $\dim(H) > 1$, the Gelfand spectrum $\Sigma(B(H))$ of $B(H)$ is empty, i.e., there are no nonzero linear maps $\omega: B(H) \to \mathbb{C}$ that satisfy $\omega(ab) = \omega(a)\omega(b)$.

In particular, there are no nonzero linear maps $\omega: B(H) \to \mathbb{C}$ that are dispersion-free, i.e., satisfy $\Delta_\omega(a) = 0$, with $\Delta_\omega(a) = \omega(a^2) - \omega(a)^2$.


Proof. Suppose $\omega \in \Sigma(B(H))$. Multiplicativity for $b = a = a^*$ implies that $\omega$ is positive, whereas for $b = 1$, it implies that $\omega$ is normalized. Hence $\omega$ must be a state. Now use Theorem 2.7 together with multiplicativity for $b = a = a^*$, which implies that $\Delta_\omega(a) = 0$. This contradicts Proposition 2.3.


On the other hand, the pure state space of $B(H)$ is by no means empty, and despite Proposition 2.10, we will see that the special density operators $\rho = e_\psi$ in (2.7) to some extent do play the role of the points $x \in X$. Let us write

$$\mathcal{P}_1(H) = \{e \in B(H) \mid e^2 = e^* = e,\ \mathrm{Tr}(e) = 1\} \tag{2.39}$$

for the set of all one-dimensional projections on $H$; note that $\mathrm{Tr}(e) = \dim(eH)$ for any projection $e \in \mathcal{P}(H)$. Each $e \in \mathcal{P}_1(H)$ takes the form $e = e_\psi$ for some unit vector $\psi$, see (2.7).


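Lemma 2.11. A density operator $\rho$ is an extreme point of the convex set $\mathcal{D}(H)$ of all density operators on $H$ iff $\rho = e_\psi$ for some unit vector $\psi \in H$.


Proof. The argument is similar to the proof of Proposition 1.11, identifying states with density operators through (2.33). To show that $e_\psi \in \partial_e S(B(H))$, assume $e_\psi = t\rho_1 + (1-t)\rho_2$ for some $t \in (0,1)$ and $\rho_1, \rho_2 \in S(B(H))$. Evaluating this equality at $a = |\varphi\rangle\langle\varphi|$, where $\varphi \perp \psi$, yields $(\varphi, \rho_i\varphi) = 0$ for $i = 1,2$, so that each positive operator $\rho_i$ vanishes on the orthogonal complement of $\psi$ and hence $\rho_1 = \rho_2 = e_\psi$. Conversely, the spectral decomposition (2.6) shows that $\rho \notin \partial_e S(B(H))$ unless $\rho = e_\psi$ for some unit vector $\psi \in H$.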


Consequently, for the moment just as sets (and even as topological spaces), one has

$$P(\mathcal{D}(H)) = \mathcal{P}_1(H); \tag{2.40}$$
$$P(B(H)) \cong \mathcal{P}_1(H), \tag{2.41}$$

where the second isomorphism is given by (2.33). Defining a state $\omega_\psi$ by

$$\omega_\psi(a) = (\psi, a\psi), \tag{2.42}$$

cf. (2.18), the isomorphism (2.41) is the correspondence $\omega_\psi \leftrightarrow e_\psi$, cf. (2.7).




This isomorphism becomes more interesting if we note that both spaces are naturally equipped with transition probabilities. For $P(B(H))$ we canonically have

$$\tau^{B(H)}(\omega_\psi, \omega_\varphi) = \inf\{\omega_\psi(a) \mid a \in B(H),\ 0 \leq a \leq 1_H,\ \omega_\varphi(a) = 1\}, \tag{2.43}$$

as in (1.38) for $A = B(H)$. Furthermore, on $\mathcal{P}_1(H)$ we define (with some foresight)

$$\tau^{\mathcal{P}_1(H)}(e, f) = \mathrm{Tr}(ef). \tag{2.44}$$


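Theorem 2.12. The pairs $(P(B(H)), \tau^{B(H)})$ and $(\mathcal{P}_1(H), \tau^{\mathcal{P}_1(H)})$ are isomorphic as sets with a transition probability. In particular, we have, cf. (2.13),

$$\tau^{B(H)}(\omega_\psi, \omega_\varphi) = |(\psi,\varphi)|^2 = \mathrm{Tr}(e_\psi e_\varphi) = \tau^{\mathcal{P}_1(H)}(e_\psi, e_\varphi). \tag{2.45}$$

Proof. The last equality is a simple computation. The first follows if we can show that the infimum in (2.43) is reached at $a = e_\varphi$. To this end, we prove that for any $0 \leq a \leq 1_H$ with $\omega_\varphi(a) = 1$ we must have $(\psi, a\psi) \geq |(\varphi,\psi)|^2$. Indeed, the condition $\omega_\varphi(a) = (\varphi, a\varphi) = 1$ with $\|a\| \leq 1$ (which follows from $0 \leq a \leq 1_H$) and $\|\varphi\| = 1$ imply, by Cauchy-Schwarz, that $a\varphi = \varphi$. Since $a^* = a$ (by positivity of $a$), we also have $a: (\mathbb{C}\cdot\varphi)^\perp \to (\mathbb{C}\cdot\varphi)^\perp$, so we may write $a = e_\varphi + a'$, with $a'\varphi = 0$ and $a'$ mapping $(\mathbb{C}\cdot\varphi)^\perp$ to itself. Then $a \geq 0$ implies $a' \geq 0$. If $(\psi, a\psi) < |(\varphi,\psi)|^2$, then $(\psi, a'\psi) < 0$, which contradicts positivity of $a'$ (and hence of $a$).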
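The equalities in (2.45) can be illustrated on $\mathbb{C}^2$ with a few lines of Python. This is only a sketch for one hand-picked pair of unit vectors; the helper names are hypothetical, and the inner product is taken antilinear in its first entry, as in (2.42).

```python
def inner(psi, phi):
    """Inner product (psi, phi), antilinear in the first entry."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def projection(psi):
    """Matrix of e_psi = |psi><psi| for a unit vector psi."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def trace_of_product(e, f):
    """Tr(ef) for square matrices given as nested lists."""
    n = len(e)
    return sum(e[i][k] * f[k][i] for i in range(n) for k in range(n))

def expectation(psi, a):
    """omega_psi(a) = (psi, a psi), cf. (2.42)."""
    a_psi = [sum(a[i][k] * psi[k] for k in range(len(psi))) for i in range(len(a))]
    return inner(psi, a_psi)

s = 2 ** -0.5
psi = [complex(1), complex(0)]
phi = [complex(s), complex(0, s)]
e_psi, e_phi = projection(psi), projection(phi)

tau_trace = trace_of_product(e_psi, e_phi).real   # right-hand side of (2.44)
tau_overlap = abs(inner(psi, phi)) ** 2           # |(psi, phi)|^2
tau_at_e_phi = expectation(psi, e_phi).real       # omega_psi(a) at a = e_phi

# Another admissible operator in (2.43): a = 1_H also satisfies omega_phi(a) = 1.
identity = [[complex(1), complex(0)], [complex(0), complex(1)]]
value_at_identity = expectation(psi, identity).real
```

All three `tau_*` values agree (up to rounding), while `value_at_identity` is larger, consistent with the infimum in (2.43) being attained at $a = e_\varphi$.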


The theory of observables and spectral resolutions of the kind (1.45) may be worked out completely for the "quantum" transition probabilities in this theorem:


Proposition 2.13. 1. There is a bijective correspondence between self-adjoint operators $a \in B(H)$ and observables $f$ on $(\mathcal{P}_1(H), \tau^{\mathcal{P}_1(H)})$ à la Definition 1.18.6:

• Given a self-adjoint operator $a$, define an observable $f_a$ at $e_\psi \in \mathcal{P}_1(H)$ by

$$f_a(e_\psi) = \mathrm{Tr}(e_\psi a) = (\psi, a\psi); \tag{2.46}$$

• Given an observable $f = \sum_j c_j \tau^{\mathcal{P}_1(H)}_{e_j}$, define an operator $a_f$ by

$$a_f = \sum_j c_j e_j. \tag{2.47}$$

2. Each such observable $f = f_a$ has a unique spectral resolution as in (1.45), i.e.,


ta So Ae, (2.48) 


Leo(a) 


where $S_\lambda$ is the (automatically orthoclosed) subset of $\mathcal{P}_1(H)$ whose elements $e$ satisfy $eH \subseteq H_\lambda$, where $H_\lambda \subset H$ is the eigenspace for the eigenvalue $\lambda \in \sigma(a)$.
3. The product defined by (1.46) - (1.47) is equal to

$$f_a^2 = f_{a^2}; \tag{2.49}$$
$$f_a \circ f_b = f_{(ab+ba)/2}. \tag{2.50}$$

Proof. Any spectral decomposition $a = \sum_i \lambda_i |\upsilon_i\rangle\langle\upsilon_i|$ puts $f_a$ as defined in (2.46) in the general form (1.44), with $c_i = \lambda_i$ and $\psi_i = e_{\upsilon_i}$. The rest should be clear.


We now turn to the quantum counterpart of Proposition 1.13. The main difference is that although extremal decompositions of mixed states into pure ones always exist, they are no longer unique. For example, for $H = \mathbb{C}^2$, we have

$$\rho = \mathrm{diag}(2/3, 1/3) = \tfrac{2}{3} e_{\upsilon_1} + \tfrac{1}{3} e_{\upsilon_2} = \tfrac{1}{2}\left(e_{\varphi_1} + e_{\varphi_2}\right),$$

where $(\upsilon_1, \upsilon_2)$ is the standard basis of $\mathbb{C}^2$, and

$$\varphi_1 = \left(\sqrt{2/3}, \sqrt{1/3}\right), \quad \varphi_2 = \left(\sqrt{2/3}, -\sqrt{1/3}\right).$$

More generally, take any basis $(w_i)$ of $H = \mathbb{C}^n$, assume (2.6), and for each $i$ for which $\sqrt{\rho}\,w_i \neq 0$ (where $\sqrt{\rho} = \sum_i \sqrt{p_i}\,|\upsilon_i\rangle\langle\upsilon_i|$), define $t_i = \|\sqrt{\rho}\,w_i\|^2$, as well as the unit vector $\varphi_i = \sqrt{\rho}\,w_i/\|\sqrt{\rho}\,w_i\|$. Then $\rho = \sum_i t_i e_{\varphi_i}$ is an extremal decomposition of $\rho$. The above example corresponds to the special case $t_1 = t_2 = 1/2$, with

$$n = 2, \quad p_1 = 2/3, \quad p_2 = 1/3, \quad w_1 = (1/\sqrt{2}, 1/\sqrt{2}), \quad w_2 = (1/\sqrt{2}, -1/\sqrt{2}).$$

One might require the $\varphi_i$ to be mutually orthogonal, but even that does not imply uniqueness of the extremal decomposition: take, for example, $\rho = (1/n)\cdot 1_n$, where $1_n$ is the $n \times n$ unit matrix on $H = \mathbb{C}^n$. Then any basis induces (2.6).
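The construction $t_i = \|\sqrt{\rho}\,w_i\|^2$, $\varphi_i = \sqrt{\rho}\,w_i/\|\sqrt{\rho}\,w_i\|$ can be traced step by step in code. The sketch below makes two simplifying assumptions that are not in the text: $\rho$ is taken diagonal in the standard basis, and the orthonormal basis $(w_i)$ is real, so that plain floats suffice. The function names are illustrative.

```python
import math

def decomposition(p, basis):
    """Pairs (t_i, phi_i) for rho = diag(p) and an orthonormal basis (w_i)."""
    terms = []
    for w in basis:
        v = [math.sqrt(pk) * wk for pk, wk in zip(p, w)]   # sqrt(rho) w_i
        t = sum(x * x for x in v)                          # ||sqrt(rho) w_i||^2
        if t > 0:                                          # skip w_i with sqrt(rho) w_i = 0
            terms.append((t, [x / math.sqrt(t) for x in v]))
    return terms

def recombine(terms, n):
    """The matrix sum_i t_i |phi_i><phi_i| (real case)."""
    rho = [[0.0] * n for _ in range(n)]
    for t, phi in terms:
        for i in range(n):
            for j in range(n):
                rho[i][j] += t * phi[i] * phi[j]
    return rho

s = 1 / math.sqrt(2)
p = [2 / 3, 1 / 3]
terms = decomposition(p, [[s, s], [s, -s]])    # the example w_1, w_2 from the text
rho_again = recombine(terms, 2)
spectral = decomposition(p, [[1.0, 0.0], [0.0, 1.0]])  # standard basis: recovers (2.6)
```

With the basis $(w_1, w_2)$ both weights come out as $1/2$, while the standard basis returns the weights $2/3$ and $1/3$; both lists recombine to the same $\rho$.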

Nonetheless, under appropriate assumptions uniqueness does follow. 


Proposition 2.14. 1. Any density operator $\rho$ on $H$ has an extremal decomposition

$$\rho = \sum_{i=1}^m p_i e_{\psi_i}, \tag{2.51}$$

where $m \leq \dim(H)$, the $p_i$ are probabilities, and the $\psi_i$ are distinct unit vectors.
2. This decomposition can be chosen such that the $\psi_i$ are mutually orthogonal, in which case it is unique iff each of the non-zero eigenvalues of $\rho$ is simple.


Proof. The existence of the extremal decomposition (2.51) of $\rho$ follows from its spectral decomposition (2.6), which also provides mutually orthogonal $\psi_i$ as required in claim 2. If $\rho$ has some degenerate non-zero eigenvalue, the example just given yields non-uniqueness of (2.51). For the converse direction, use uniqueness of the decomposition (2.6) under the condition that each of the non-zero eigenvalues of $\rho$ is simple.


In the light of Theorem 2.7, it would be interesting to reformulate Proposition 2.14 directly in terms of the states on $B(H)$; note our standing assumption $\dim(H) < \infty$!


Proposition 2.15. 1. Any state $\omega$ on $B(H)$ has an extremal decomposition

$$\omega = \sum_{i=1}^m p_i \omega_i \tag{2.52}$$

into distinct pure states $\omega_i \in P(B(H))$, where $m \leq \dim(H)$, $p_i > 0$, and $\sum_i p_i = 1$.
2. The unit vectors $\psi_i$ that correspond to the pure states $\omega_i$ in (2.52) via (2.42) are mutually orthogonal (and hence are part or all of a basis of $H$) iff

$$\|\omega_i - \omega_j\| = 2 \quad (i \neq j). \tag{2.53}$$

3. Extremal decompositions (2.52) satisfying (2.53) exist and correspond bijectively to orthogonal families $(e_i)$ of one-dimensional projections on $H$ (i.e., $e_i e_j = \delta_{ij} e_i$ and $\mathrm{Tr}(e_i) = 1$, respectively) for which $\omega(e_i) > 0$, $\sum_i \omega(e_i) = 1$, and

$$\omega(ae_i) = \omega(e_i a), \quad a \in B(H). \tag{2.54}$$

In terms of such a family, the decomposition (2.52) is given by

$$p_i = \omega(e_i); \tag{2.55}$$
$$\omega_i(a) = \frac{\omega(ae_i)}{\omega(e_i)}. \tag{2.56}$$

Hence an extremal decomposition (2.52) with all $\omega_i$ mutually orthogonal in the sense of (2.53) is unique iff the family $(e_i)$ with the above properties is.

Proof. Claim 1 clearly follows from no. 3. To prove (2.53), assume (2.42), so that

$$\|\omega_i - \omega_j\| = \sup\{|(\psi_i, a\psi_i) - (\psi_j, a\psi_j)| : a \in B(H),\ \|a\| = 1\}. \tag{2.57}$$

Clearly, $|(\psi, a\psi)| \leq 1$ when $\|a\| = \|\psi\| = 1$, hence $|(\psi_i, a\psi_i) - (\psi_j, a\psi_j)| \leq 2$, and the upper bound $\|\omega_i - \omega_j\| = 2$ in (2.57) is reached iff $|(\psi_i, a\psi_i)| = 1$ and $(\psi_j, a\psi_j) = -(\psi_i, a\psi_i)$. By Cauchy-Schwarz, this holds iff $a\psi_i = \lambda\psi_i$ as well as $a\psi_j = -\lambda\psi_j$ for some $\lambda \in \mathbb{T}$. If $\psi_i \perp \psi_j$, then this is accomplished by the operator $a = |\psi_i\rangle\langle\psi_i| - |\psi_j\rangle\langle\psi_j|$; note that $\sigma(a) = \{-1, 1\}$ for $\dim(H) = 2$ and $\sigma(a) = \{-1, 0, 1\}$ for $\dim(H) > 2$, so indeed $\|a\| = 1$ by (A.47). If, on the other hand, $(\psi_i, \psi_j) \neq 0$, then no $a$ with $\|a\| = 1$ can meet these eigenvalue equations. One way to see this is to reduce to $H = \mathbb{C}^2$, since $a$ in (2.57) can be replaced by $eae$, where $e$ is the projection onto the linear span of $\psi_i$ and $\psi_j$. Picking a basis of $\mathbb{C}^2$ (with say $\upsilon_1 = \psi_i$), the two eigenvalue equations for $a$ yield a matrix representation of $a$, from which $\|a\|^2 = \|a^*a\|$ may be computed by calculating the eigenvalues of $a^*a$ and using (A.47). This gives $\|a\| > 1$ unless $(\psi_i, \psi_j) = 0$.

One direction of the proof of the third claim easily follows from Theorem 2.7: any spectral decomposition (2.6) of $\rho$ provides the projections

$$e_i = |\upsilon_i\rangle\langle\upsilon_i| \tag{2.58}$$

of the proposition. For example, eq. (2.54) comes down to $[\rho, e_i] = 0$, which is the case iff $e_i$ commutes with all spectral projections of $\rho$, which clearly holds for (2.58). Uniqueness of the $e_i$ then corresponds to uniqueness of (2.6) and hence to non-degeneracy of the non-zero eigenvalues $p_i$ of $\rho$, as in Proposition 2.14.

The opposite direction, i.e., proving that (2.58) exhausts all possibilities for (2.53) - (2.54), is based on the GNS-construction and requires an entire subsection.
2.4 The GNS-construction for matrices


The proof of Proposition 2.15 may be completed on the basis of the GNS-construction begun in §1.5, which in this subsection we develop for $A = B(H)$, where, as usual, $\dim(H) < \infty$. In that case, we may use Theorem 2.7 to simplify matters.

First, to prove (1.76) we use (2.33) and cyclicity of the trace, compute the trace by summing over a basis $(\upsilon_i)$ of eigenvectors of $a^*a$, say $a^*a\upsilon_i = \mu_i\upsilon_i$, where $\mu_i \geq 0$ by positivity of $a^*a$, and use (A.47) (for $a^*a$ rather than $a$) to obtain:

$$\begin{aligned} \omega(b^*a^*ab) &= \mathrm{Tr}(\rho b^*a^*ab) = \sum_i (\upsilon_i, b\rho b^*a^*a\upsilon_i) = \sum_i \mu_i (\upsilon_i, b\rho b^*\upsilon_i) \\ &\leq \|a^*a\| \sum_i (\upsilon_i, b\rho b^*\upsilon_i) = \|a\|^2\,\mathrm{Tr}(\rho b^*b) = \|a\|^2\,\omega(b^*b), \end{aligned}$$

where we used $(\upsilon_i, b\rho b^*\upsilon_i) = (b^*\upsilon_i, \rho b^*\upsilon_i) \geq 0$ to justify the inequality.
We now explain all cases of interest, paying special attention to the commutant

$$\pi_\omega(A)' = \{B \in B(H_\omega) \mid \pi_\omega(a)B = B\pi_\omega(a)\ \forall a \in A\}; \tag{2.59}$$

to distinguish operators on $H$ from operators on $H_\omega$, we write the latter in capitals. For simplicity we also put $H = \mathbb{C}^n$ (with the standard inner product), so that

$$B(H) = M_n(\mathbb{C}), \tag{2.60}$$

and all operators are matrices. Performing a suitable unitary transformation or change of basis if necessary, we also assume that the unit vectors $\upsilon_i$ in the spectral decomposition (2.6) of $\rho$ form (all or part of) the standard basis $(\upsilon_1, \ldots, \upsilon_n)$ of $\mathbb{C}^n$. As in (1.74), we denote the null space by

$$N_\rho = \{a \in B(H) \mid \mathrm{Tr}(\rho a^*a) = 0\}. \tag{2.61}$$

• If $\rho = |\upsilon_j\rangle\langle\upsilon_j|$, the corresponding pure state (2.42) is $\omega(a) = (\upsilon_j, a\upsilon_j)$, with

$$N_\rho = \{a \in A \mid a\upsilon_j = 0\}. \tag{2.62}$$

Hence $a \in N_\rho$ iff the $j$'th column $C_j(a)$ of $a$ vanishes, so we have $a - b \in N_\rho$ iff $C_j(a) = C_j(b)$. Thus the equivalence class $a_\rho \in M_n(\mathbb{C})/N_\rho$ may be identified with $C_j(a)$. Consequently, we obtain

$$H_\rho = M_n(\mathbb{C})/N_\rho \cong \mathbb{C}^n, \tag{2.63}$$

under the unitary isomorphism $u: H_\rho \to \mathbb{C}^n$, $a_\rho \mapsto C_j(a)$, with inverse $u^{-1}: z \mapsto a_\rho$, $z \in \mathbb{C}^n$, where $a$ is the matrix with $C_j(a) = z$ and zeros elsewhere (i.e., $a_{ij} = z_i$ and $a_{ik} = 0$ for all $i$ and $k \neq j$). We likewise write $u^{-1}w = b_\rho$, with $b_{ij} = w_i$ and $b_{ik} = 0$ for all $i$ and $k \neq j$. With $ua_\rho = z$ and $ub_\rho = w$, we obtain (beware: no sum over $j$!):

$$(a_\rho, b_\rho) = \mathrm{Tr}(\rho a^*b) = \sum_i \overline{a_{ij}}\,b_{ij} = (z, w)_{\mathbb{C}^n} = (ua_\rho, ub_\rho)_{\mathbb{C}^n}.$$
The GNS-representation $\pi_\rho$, originally given on $H_\rho$ by (1.77), is accordingly transformed to $u\pi_\rho(a)u^{-1} \equiv \hat{\pi}_\rho(a)$ on $\mathbb{C}^n$, which is given by

$$\hat{\pi}_\rho(a)w = u\pi_\rho(a)b_\rho = u(ab)_\rho = C_j(ab) = aw,$$

and the cyclic vector $u\Omega_\rho \in \mathbb{C}^n$ is just the basis vector $\upsilon_j$ from which we started. More generally, for a pure state (2.42) the GNS-representation $\pi_{\omega_\psi}(M_n(\mathbb{C}))$ is equivalent to the defining representation on $\mathbb{C}^n$, with canonical cyclic vector $\psi$. Finally, since only multiples of the unit matrix commute with all matrices, it follows that

$$\pi_{\omega_\psi}(M_n(\mathbb{C}))' = \mathbb{C}\cdot 1_n. \tag{2.64}$$


• The 'opposite' case occurs when $\rho$ is invertible, in other words, when the sum over $i$ in (2.6) has $n$ nonzero terms. Hence

$$\mathrm{Tr}(\rho a^*a) = \sum_i p_i \|a\upsilon_i\|^2 \tag{2.65}$$

vanishes iff $a\upsilon_i = 0$ for each $i$, i.e., $a = 0$, so that $N_\rho = \{0\}$ and hence

$$H_\rho = M_n(\mathbb{C}). \tag{2.66}$$

The GNS-constructed inner product on $M_n(\mathbb{C})$, cf. (1.78), given by

$$(a_\rho, b_\rho) = \mathrm{Tr}(\rho a^*b), \tag{2.67}$$

may be transformed into the usual one (2.34) by the following linear map:

$$u: M_n(\mathbb{C}) \to M_n(\mathbb{C}); \tag{2.68}$$
$$ua_\rho = a\rho^{1/2}. \tag{2.69}$$

This map is unitary from the Hilbert space $(M_n(\mathbb{C}), (\cdot,\cdot)_\rho)$ to the Hilbert space $(M_n(\mathbb{C}), (\cdot,\cdot))$, for it is invertible, with inverse $u^{-1}a = (a\rho^{-1/2})_\rho$, as well as isometric:

$$(ua_\rho, ub_\rho) = \mathrm{Tr}(\rho^{1/2}a^*b\rho^{1/2}) = \mathrm{Tr}(\rho a^*b) = (a_\rho, b_\rho).$$

The transformed representation $\hat{\pi}_\rho(a) = u\pi_\rho(a)u^{-1}$ on $M_n(\mathbb{C})$ is simply given by

$$\hat{\pi}_\rho(a)b = ab, \tag{2.70}$$

and the cyclic vector $u\Omega_\rho$ in $M_n(\mathbb{C})$ becomes $\rho^{1/2}$, so that, as in (1.73),

$$(\rho^{1/2}, \hat{\pi}_\rho(a)\rho^{1/2}) = \mathrm{Tr}(\rho a). \tag{2.71}$$

In this case, the commutant is easily computed to be

$$\hat{\pi}_\rho(M_n(\mathbb{C}))' \cong M_n(\mathbb{C}), \tag{2.72}$$
since any linear map $C: M_n(\mathbb{C}) \to M_n(\mathbb{C})$ that satisfies $C(ab) = aC(b)$ for each $a, b \in M_n(\mathbb{C})$ is of the form $C(a) = ac = R_c(a)$ for some $c \in M_n(\mathbb{C})$, namely $c = C(1)$; to see this, just take $b = 1$. Since this involves right multiplication $R_c$ by $c$, which messes up the order in that $R_cR_d = R_{dc}$, one has a choice in implementing the isomorphism (2.72) either as a linear anti-homomorphism (of algebras) $c \mapsto R_c$, or as an anti-linear homomorphism $c \mapsto R_{c^*}$ (see also Theorem C.159).

Further insight into the structure of this representation comes from the realization

$$M_n(\mathbb{C}) \cong \mathbb{C}^n \otimes \mathbb{C}^n, \tag{2.73}$$

as Hilbert spaces under the unitary map $v: a \mapsto \sum_{i,j} a_{ij}\,\upsilon_i \otimes \upsilon_j$. This yields

$$v\hat{\pi}_\rho(a)v^{-1} = a \otimes 1_n, \tag{2.74}$$

as an operator on $\mathbb{C}^n \otimes \mathbb{C}^n$, and indeed for any Hilbert spaces $H_1$, $H_2$ one has

$$(B(H_1) \otimes \mathbb{C}\cdot 1_{H_2})' = \mathbb{C}\cdot 1_{H_1} \otimes B(H_2). \tag{2.75}$$


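• Finally, in the 'intermediate' case the sum in the spectral decomposition (2.6) has $1 < m < n$ nonzero terms. Using the ensuing (partial) basis $(\upsilon_1, \ldots, \upsilon_m)$ of $\mathbb{C}^n$ (viz. $\mathbb{C}^m$), analogously to (2.66) with (2.73) we obtain, up to unitary equivalence,

$$H_\rho \cong \mathbb{C}^n \otimes \mathbb{C}^m; \tag{2.76}$$
$$\pi_\rho(a) \cong a \otimes 1_m; \tag{2.77}$$
$$\Omega_\rho = \sum_{i=1}^m \sqrt{p_i}\,\upsilon_i \otimes \upsilon_i; \tag{2.78}$$
$$\pi_\rho(M_n(\mathbb{C}))' \cong M_m(\mathbb{C}). \tag{2.79}$$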


The relevance of all this to the decomposition of states on $B(H)$ is as follows.


Proposition 2.16. Let $\omega$ be a state on $B(H) = M_n(\mathbb{C})$. Then each decomposition

$$\omega = \sum_i p_i\omega_i, \tag{2.80}$$

where the $p_i$ are probabilities (but the states $\omega_i$ are not necessarily pure) is induced by a family $(A_i)$ of nonzero operators in the commutant $\pi_\omega(B(H))'$ that satisfy:

$$0 \leq A_i \leq 1; \tag{2.81}$$
$$\sum_i A_i = 1. \tag{2.82}$$

Namely, given such a family of operators $A_i$, the decomposition (2.80) is given by:

$$p_i = (\Omega_\omega, A_i\Omega_\omega); \tag{2.83}$$
$$\omega_i(a) = \frac{(\Omega_\omega, \pi_\omega(a)A_i\Omega_\omega)}{(\Omega_\omega, A_i\Omega_\omega)}. \tag{2.84}$$
Proof. The claim that such a family yields (2.80) is trivial, except for the remark that automatically $p_i > 0$, since $(\Omega_\omega, A_i\Omega_\omega) = 0$ would imply $\sqrt{A_i}\,\Omega_\omega = 0$ and hence

$$\sqrt{A_i}\,a_\omega = \sqrt{A_i}\,\pi_\omega(a)\Omega_\omega = \pi_\omega(a)\sqrt{A_i}\,\Omega_\omega = 0$$

for any $a \in B(H)$; by (1.72) this gives $\sqrt{A_i} = 0$ and therefore $A_i = (\sqrt{A_i})^2 = 0$.
Conversely, each state $\omega_i$ in (2.80) defines a sesquilinear form $Q_i$ on $H_\omega$ by $Q_i(a_\omega, b_\omega) = p_i\omega_i(a^*b)$, which is well defined by $p_i\omega_i(a^*a) \leq \omega(a^*a)$ and (A.1), and is positive because $\omega_i$ is a state. Proposition A.23 then provides us with a positive operator $A_i$ for which $Q_i(a_\omega, b_\omega) = (a_\omega, A_ib_\omega)$, hence $p_i\omega_i(a^*b) = (a_\omega, A_ib_\omega)$, in agreement with (2.83) - (2.84). Next,

$$(a_\omega, A_i\pi_\omega(c)b_\omega) = (a_\omega, A_i(cb)_\omega) = p_i\omega_i(a^*cb) = ((c^*a)_\omega, A_ib_\omega) = (a_\omega, \pi_\omega(c)A_ib_\omega),$$

so $A_i \in \pi_\omega(B(H))'$. Finally, the bound (2.81) follows from $0 \leq p_i\omega_i \leq \omega$, whilst $\sum_i p_i\omega_i = \omega$ yields (2.82).


We now complete the proof of Proposition 2.15. We assume (2.33), where we initially take $\rho$ to be invertible. We omit the hat in (2.70) as well as the suffix $\omega$ or $\rho$ on vectors. As noted, we then have $\Omega_\rho = \rho^{1/2}$, and we also know that $A_i$ is given by $A_ib = ba_i$ for some $a_i \in M_n(\mathbb{C})$, viz. $a_i = A_i1_n$ (where $1_n = 1_H$ is to be distinguished from $\Omega_\rho = \rho^{1/2}$). In this case, (2.81) means $0 \leq \mathrm{Tr}(b^*ba_i) \leq 1$ for each $b$ with $\mathrm{Tr}(b^*b) = 1$, which is true iff $0 \leq a_i \leq 1$, whereas (2.82) immediately yields $\sum_i a_i = 1$. In terms of such a family $(a_i)$ in $M_n(\mathbb{C})$ itself, the decomposition (2.80) of $\omega = \mathrm{Tr}(\rho\,\cdot)$ into arbitrary states $\omega_i$ follows from (2.83) - (2.84) as

$$p_i = \mathrm{Tr}(\rho a_i); \tag{2.85}$$
$$\omega_i(a) = \mathrm{Tr}(\rho_i a); \tag{2.86}$$
$$\rho_i = \frac{\rho^{1/2}a_i\rho^{1/2}}{\mathrm{Tr}(\rho a_i)}. \tag{2.87}$$

To obtain pure and orthogonal states $\omega_i$, we subsequently ask when the new density matrices $\rho_i$ are mutually orthogonal one-dimensional projections $\rho_i = |\upsilon_i\rangle\langle\upsilon_i|$.

To answer this, we use the spectral theorem (A.37) - (A.38) applied to $\rho$, which gives $\rho = \sum_j p_je_j$ and hence $\rho^{1/2} = \sum_j \sqrt{p_j}\,e_j$, so that

$$\rho^{1/2}a_i\rho^{1/2} = \sum_{j,k} \sqrt{p_jp_k}\,e_ja_ie_k. \tag{2.88}$$

This can only be proportional to a one-dimensional projection if each $a_i$ is a one-dimensional projection that commutes with all spectral projections $e_j$ of $\rho$ (and hence also commutes with $\rho$ itself), and all further constraints on the $a_i$ may then only be satisfied if $a_i = |\upsilon_i\rangle\langle\upsilon_i|$, for some basis $(\upsilon_i)$ of eigenvectors $\upsilon_i$ of $\rho$.

A similar analysis applies to non-invertible $\rho$, the only new point being that projections $e_i$ orthogonal to the range of $\rho$ fall into the null space $N_\rho$, cf. (2.76) - (2.79), and hence do not contribute to (2.52), so that they may be ignored.
2.5 The Born rule from Bohrification


The Bohrification approach to quantum mechanics studies noncommutative algebras of observables like $B(H)$ through their commutative subalgebras. In this section we show how the Born rule (2.8) emerges from that perspective. Our discussion is based on the interplay between the three kinds of (finite-dimensional) C*-algebras:


• $C(X)$ is a C*-algebra under the pointwise operations (1.18) - (1.20) and the supremum-norm (1.24); we still assume that $X$ is finite.

• $B(H)$ is a C*-algebra under the pointwise operations (2.22) - (2.24) and the operator norm (A.18); our standing assumption remains $\dim(H) < \infty$.

• $C^*(a)$ is the C*-algebra generated by $a \in B(H)$ and $1_H$ (i.e., the intersection of all unital C*-algebras in $B(H)$ that contain $a$). If $a^* = a$, then $C^*(a)$ is commutative.


Each of these is unital, since $C(X)$ has a unit $1_X$ (i.e. the function $x \mapsto 1$), $B(H)$ has a unit $1_H$ (i.e. the operator $\psi \mapsto \psi$), and $C^*(a)$ shares the unit $1_H$. The first two classes overlap just in case $\dim(H) = 1$ and $X$ is a singleton (in which case $B(\mathbb{C}) \cong C(\{x\}) \cong \mathbb{C}$); otherwise, the fundamental difference between the two is that $C(X)$ is commutative in that $fg = gf$ for all $f, g$, whereas $B(H)$ is non-commutative. However, the system of C*-algebras $C^*(a)$ within $B(H)$, where $a \in B(H)_{\mathrm{sa}}$ varies, to some extent bridges the gap between the commutative and the non-commutative worlds. This relatively simple situation goes to the heart of exact Bohrification.


Theorem 2.17. Let $a^* = a \in B(H)$, where $H$ is a finite-dimensional Hilbert space.


1. The commutative C*-algebra $C^*(a)$ consists of all polynomials in $a$.
2. Any element of $C^*(a)$ is a linear combination of the spectral projections $e_\lambda$ of $a$.
3. For functions $f: \sigma(a) \to \mathbb{C}$, the map $f \mapsto f(a)$ defined by

$$f(a) = \sum_{\lambda \in \sigma(a)} f(\lambda)\cdot e_\lambda \tag{2.89}$$

gives a (necessarily unital) isomorphism of commutative C*-algebras

$$C(\sigma(a)) \cong \mathbb{C}^{|\sigma(a)|} \cong C^*(a). \tag{2.90}$$


Proof. Noting that any function on the finite subset $\sigma(a)$ of $\mathbb{R}$ is continuous, this is a restatement of Theorem A.15 for finite-dimensional Hilbert spaces.


We now come to the main point. States on unital C*-algebras $A$ may be defined just as in Definitions 1.14 and 2.5, i.e. as positive linear functionals $\omega: A \to \mathbb{C}$ that satisfy $\omega(1_A) = 1$ (cf. Proposition C.5). Recall Theorem 1.15 and Theorem 2.7.


Theorem 2.18. Let $\omega$ be a state on $B(H)$, represented by a density operator $\rho$ via (2.33), and let $a \in B(H)$ be a self-adjoint operator. Then the restriction of $\omega$ to $C^*(a) \subset B(H)$ is a state, which also induces a state $\omega_{|C(\sigma(a))}$ on $C(\sigma(a))$ through (2.89) - (2.90), i.e., $\omega_{|C(\sigma(a))}(f) = \omega(f(a))$. The probability measure on $\sigma(a)$ that corresponds to the state $\omega_{|C(\sigma(a))}$ on $C(\sigma(a))$, then, is given by the Born rule (2.9).
Proof. First, the restriction of a state on a given unital C*-algebra to a unital C*-subalgebra remains a state. Second, isomorphisms of unital C*-algebras pull back to state spaces in that, if $\varphi: A \to B$ is an isomorphism, and $\omega$ is a state on $B$, then $\varphi^*\omega: A \to \mathbb{C}$ is a state on $A$, where $\varphi^*\omega(a) = \omega(\varphi(a))$. We now compute

$$\omega_{|C(\sigma(a))}(f) = \omega(f(a)) = \mathrm{Tr}(\rho f(a)) = \sum_{\lambda \in \sigma(a)} f(\lambda)\,\mathrm{Tr}(\rho e_\lambda) = \sum_{\lambda \in \sigma(a)} f(\lambda)\,p_a(\lambda) = E_{p_a}(f), \tag{2.91}$$

where, from left to right, the first equality is just the definition of $\omega_{|C(\sigma(a))}$, whereas the others in turn follow from (2.33), (2.89), (2.8), and (1.9), respectively.


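Note that Theorem 2.18 implies Theorem 2.2. The simplest nontrivial illustration is:

$$H = \mathbb{C}^n; \tag{2.92}$$
$$\omega = \omega_\psi; \tag{2.93}$$
$$\psi = \sum_i c_iu_i; \tag{2.94}$$
$$a = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \sum_i \lambda_i\,|u_i\rangle\langle u_i|, \tag{2.95}$$

with respect to the standard basis $(u_i)$ of $\mathbb{C}^n$, with all $\lambda_i \in \mathbb{R}$ different, cf. (2.42). The C*-algebra $C^*(a) \cong \mathbb{C}^n$ then consists of all diagonal matrices

$$b = \mathrm{diag}(b_1, \ldots, b_n). \tag{2.96}$$

Since obviously

$$\sigma(a) = \{\lambda_1, \ldots, \lambda_n\}, \tag{2.97}$$

the isomorphism (2.90) is given by

$$f \mapsto \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n)). \tag{2.98}$$

The computation (2.91) in the proof of Theorem 2.18 then becomes

$$\omega_{\psi|C(\sigma(a))}(f) = (\psi, \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n))\psi) = \sum_{i=1}^n |c_i|^2 f(\lambda_i) = \sum_{i=1}^n p_a(\lambda_i)f(\lambda_i), \tag{2.99}$$

from which the Born probabilities $p_a$ may be read off as the familiar expressions

$$p_a(\lambda_i) = |c_i|^2. \tag{2.100}$$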
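The diagonal example (2.92) - (2.100) is simple enough to replay in code. In the Python sketch below, the coefficients, eigenvalues, and the test function $f(\lambda) = \lambda^2$ are arbitrary illustrative choices, as are the function names; the eigenvalues are assumed distinct, as in the text.

```python
def born_probabilities(c):
    """p_a(lambda_i) = |c_i|^2 for psi = sum_i c_i u_i, cf. (2.100)."""
    return [abs(ci) ** 2 for ci in c]

def restricted_state(c, eigenvalues, f):
    """omega_psi(f(a)) = (psi, diag(f(lambda_1), ..., f(lambda_n)) psi), cf. (2.99)."""
    return sum((ci.conjugate() * f(lam) * ci).real for ci, lam in zip(c, eigenvalues))

def expectation_under(p, eigenvalues, f):
    """E_p(f) = sum_i p(lambda_i) f(lambda_i), the right-hand side of (2.91)."""
    return sum(pi * f(lam) for pi, lam in zip(p, eigenvalues))

c = [complex(0.6, 0.0), complex(0.0, 0.8)]   # unit vector in C^2
eigenvalues = [-1.0, 2.0]                    # distinct eigenvalues of a
square = lambda lam: lam * lam

probs = born_probabilities(c)
lhs = restricted_state(c, eigenvalues, square)
rhs = expectation_under(probs, eigenvalues, square)
```

Both `lhs` and `rhs` evaluate the same sum $\sum_i |c_i|^2 f(\lambda_i)$, once through the state on $C^*(a)$ and once through the probability distribution on $\sigma(a)$.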
For an analogous treatment of the generalized Born rule (2.21), we first refer to Definition A.16 for the pertinent definitions, especially of the joint spectrum

$$\sigma(\mathbf{a}) \subseteq \sigma(a_1) \times \cdots \times \sigma(a_n) \subset \mathbb{R}^n$$

of a family $\mathbf{a} = (a_1, \ldots, a_n)$ of commuting self-adjoint operators. As in the case of a single operator, we define $C^*(\mathbf{a})$ as the smallest unital C*-subalgebra of $B(H)$ that contains each $a_i$. Generalizing Theorem A.15, we have:


Theorem 2.19. Let $\mathbf{a} = (a_1, \ldots, a_n)$ be commuting self-adjoint operators on $H$. Then $C^*(\mathbf{a})$ is commutative, and there is a unique isomorphism of C*-algebras

$$C^*(\mathbf{a}) \cong C(\sigma(\mathbf{a})), \tag{2.101}$$

under which $1 \in C^*(\mathbf{a})$ corresponds to the unit function $1_{\sigma(\mathbf{a})}: \lambda \mapsto 1$ in $C(\sigma(\mathbf{a}))$, and $a_i \in C^*(\mathbf{a})$ corresponds to the projection $\pi_i: \lambda \mapsto \lambda_i$ in $C(\sigma(\mathbf{a}))$.

For further discussion, see Appendix A, Theorem A.17.
Theorem 2.18 may then be generalized in the following way, with similar proof.


Theorem 2.20. Let $\omega$ be a state on $B(H)$, represented by a density operator $\rho$, and let $\mathbf{a} = (a_1, \ldots, a_n)$ be commuting self-adjoint operators on $H$. Then the restriction of $\omega$ to $C^*(\mathbf{a}) \subset B(H)$ is a state, which induces a state $\omega_{|C(\sigma(\mathbf{a}))}$ on $C(\sigma(\mathbf{a}))$ through the isomorphism (2.101). Then the probability measure on the joint spectrum $\sigma(\mathbf{a})$ that corresponds to $\omega_{|C(\sigma(\mathbf{a}))}$ is given by the generalized Born rule (2.21), i.e.,

$$p_{\mathbf{a}}(\lambda) = \mathrm{Tr}(\rho e_\lambda). \tag{2.102}$$

Strictly speaking, in the present context one should restrict (2.21) to $\lambda \in \sigma(\mathbf{a})$, but the claim is correct even if one does not, for the (Born) probability assigned to values $\lambda \in \sigma(a_1) \times \cdots \times \sigma(a_n)$ that do not lie in $\sigma(\mathbf{a})$ is simply zero.

As shown in Proposition A.19 in Appendix A, the multi-operator case is a special case of the single-operator case, in that $C^*(\mathbf{a}) = C^*(a)$ for a suitable self-adjoint operator $a$. Since the converse is obvious, Theorems 2.18 and 2.20 are equivalent. Corollary A.20 in Appendix A even shows that any unital commutative C*-algebra $C$ in $B(H)$ takes the form $C = C^*(a)$ for some self-adjoint operator $a \in B(H)$. Comparing the restrictions of a state $\omega$ on $B(H)$ to $C$ as the latter varies therefore comes down to asking how the various Born probability distributions $p_a$ on $C^*(a)$ are related to each other as $a$ varies. It is clear from (2.8) that if $p_a$ and $p_b$ come from the same density operator $\rho$ (as the notation indicates), then for $\lambda \in \sigma(a)$ and $\mu \in \sigma(b)$,

$$e^{(a)}_\lambda = e^{(b)}_\mu \ \Rightarrow\ p_a(\lambda) = p_b(\mu). \tag{2.103}$$

Indeed, this is the only compatibility condition between $p_a$ and $p_b$, showing that $p_a(\lambda)$ only depends on $a$ and $\lambda$ through the associated spectral projection $e_\lambda$. Condition (2.103) is a version of a general property of quantum mechanics called non-contextuality, which in this case means that, given its spectral projection $e_\lambda$, the 'context' operator $a$ is otherwise irrelevant for the Born probability $p_a(\lambda)$.
2.6 The Kadison-Singer Problem


It should be clear from the example in the previous section that pure states $\omega_\psi$ on $B(H)$ may well give rise to mixed states on $C^*(a)$; referring to (2.94) and (2.100), this is the case whenever $c_i \neq 0$ for more than one value of the index $i$. If, on the other hand, $c_i \neq 0$ for just a single value $i = j$, then $\psi = u_j$ (up to a phase), or, equivalently, $\omega_\psi(a) = (u_j, au_j)$. In that case, the given state $\omega_\psi$ is pure both on $B(H)$ and on $C^*(a)$, and the associated probability measure $\omega_{\psi|C(\sigma(a))}$ on the spectrum $\sigma(a)$ is supported by a single point, namely $\lambda_j \in \sigma(a)$.

This example suggests a general problem (first posed in the non-trivial case where $H$ is infinite-dimensional by Kadison and Singer in 1959) that is of great relevance for the Bohrification program. Namely, let $A$ be a maximal commutative unital C*-algebra in $B(H)$ and let $\omega_A$ be a pure state on $A$. We may then ask:


1. Does $\omega_A$ have an extension to a state $\omega$ on $B(H)$ at all (i.e., $\omega_{|A} = \omega_A$)?
2. If so, is $\omega$ uniquely determined by its restriction $\omega_{|A}$?
3. Either way, if $\omega$ exists, can it be chosen so as to be pure (assuming $\omega_A$ is)?


If $\dim(H) < \infty$, all these questions are easy to answer at one stroke:


Theorem 2.21. Let $\dim(H) < \infty$ and let $\omega_A$ be a pure state on a maximal commutative unital C*-algebra $A$ in $B(H)$. Then $\omega_A$ has a unique extension to a state $\omega$ on $B(H)$, which is necessarily pure.


Proof. As explained after the proof of Corollary A.20 in Appendix A, we may simply assume that $H = \mathbb{C}^n$ and that $A$ consists of all diagonal matrices; call this collection $D_n(\mathbb{C})$ (for every other case is unitarily equivalent to this one). Clearly,

$$D_n(\mathbb{C}) \cong \mathbb{C}^n, \tag{2.104}$$

from which we see that if $\omega_A$ is pure, then it must be given on $b \in D_n(\mathbb{C})$ by

$$\omega_A(b) = b_j, \tag{2.105}$$

for some $j$, cf. (2.96). If $\omega$ exists, it is given by (2.33). Using (2.6), condition (2.105) then enforces the following constraint on the $p_i$ and $\upsilon_i$ (where $(u_i)$ is the standard basis of $\mathbb{C}^n$ and $(\upsilon_i)$ is an orthonormal set diagonalizing the density operator $\rho$):

$$\sum_i p_i\,|(u_j, \upsilon_i)|^2 = 1. \tag{2.106}$$

Since $\sum_i p_i = 1$ and $|(u_j, \upsilon_i)| \leq 1$, eq. (2.106) can only hold, for given $j$, if

$$|(u_j, \upsilon_i)| = 1 \tag{2.107}$$

for all $i$ with $p_i > 0$. Since $u_j$ is a unit vector whilst the $(\upsilon_i)$ are an orthonormal set, (2.107) can only be true if there is a single $i$ for which $p_i > 0$, namely $i = j$ (and hence $p_j = 1$), in which case $\upsilon_j$ must equal $u_j$ up to a phase. Hence $\rho = |u_j\rangle\langle u_j|$, which shows that $\rho$ exists, is unique, and is pure.
At least in operational interpretations of quantum mechanics, this theorem implies that a pure quantum state (i.e., on $B(H)$) is completely determined by the outcome of a measurement of some maximal observable $a$, whose outcome, after all, gives one of the eigenvalues $\lambda_i$ in (2.95) and hence fixes the post-measurement state to be the one given by (2.105). This is, indeed, a typical way of preparing a state.

As one might expect, this is no longer true if $A = C^*(a)$ fails to be maximal (in which case a measurement of $a$ would not provide enough information about the quantum state). Namely, suppose $a = \sum_{\lambda \in \sigma(a)} \lambda\cdot e_\lambda$, as in (A.37); the maximal case occurs iff $\mathrm{Tr}(e_\lambda) = \dim(H_\lambda) = 1$ for all $\lambda \in \sigma(a)$ (equivalently, all eigenvalues $\lambda_i$ in (A.37) are different). If not, suppose $\dim(H_\lambda) > 1$ for some $\lambda$. Then any unit vector $\psi \in H_\lambda$ gives rise to a pure state $\omega_\psi$ on $B(H)$, which remains pure on $A$ (it is given by $\omega_{\psi|A}(a) = \lambda$ and hence induces the Dirac probability measure $\delta_\lambda$ on $\sigma(a)$).

Dropping the purity condition on $\omega_A$ loses uniqueness of the extension $\omega$, too, even if $A$ is maximal: take $b = \mathrm{diag}(b_1, \ldots, b_n) \in A = D_n(\mathbb{C})$, and assume that

$$\omega_A(b) = \sum_i p_ib_i \tag{2.108}$$

has more than one term (with $p_i > 0$ and $\sum_i p_i = 1$ as always), cf. (2.105). Then:


• any pure state $\omega_\psi$ as in (2.94), such that $|c_i|^2 = p_i$ for all $i$, extends $\omega_A$;
• the "decohered" mixed state $\rho = \sum_i p_i\,|u_i\rangle\langle u_i|$ extends $\omega_A$, too.
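The two extensions just listed can be compared concretely. The following Python sketch works on $\mathbb{C}^2$ with real entries only, picks the phases $c_i = \sqrt{p_i}$ (one choice among many), and uses hypothetical helper names. It evaluates both states on a diagonal matrix and on a non-diagonal one.

```python
import math

def evaluate(rho, a):
    """omega(a) = Tr(rho a) for real square matrices, cf. (2.33)."""
    n = len(rho)
    return sum(rho[i][k] * a[k][i] for i in range(n) for k in range(n))

def pure_extension(p):
    """rho = |psi><psi| with psi = sum_i sqrt(p_i) u_i, so that |c_i|^2 = p_i."""
    psi = [math.sqrt(x) for x in p]
    return [[a * b for b in psi] for a in psi]

def decohered_extension(p):
    """rho = sum_i p_i |u_i><u_i| = diag(p_1, ..., p_n)."""
    n = len(p)
    return [[p[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

p = [0.5, 0.5]
b = [[3.0, 0.0], [0.0, -1.0]]         # an element of D_2(C)
sigma_x = [[0.0, 1.0], [1.0, 0.0]]    # an operator outside D_2(C)

on_diagonal = (evaluate(pure_extension(p), b), evaluate(decohered_extension(p), b))
off_diagonal = (evaluate(pure_extension(p), sigma_x), evaluate(decohered_extension(p), sigma_x))
```

Both states return $\sum_i p_ib_i$ on the diagonal matrix `b`, so they restrict to the same $\omega_A$, yet they differ on `sigma_x` and are therefore distinct states on $B(H)$.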


Further insight into the state extension problem comes from the following result.


Proposition 2.22. Let $A$ be any unital C*-algebra in $B(H)$ (i.e., $A$ is not necessarily commutative) and let $\omega_A$ be a pure state on $A$. Then the set

$$S_A = \{\omega \in S(B(H)) \mid \omega_{|A} = \omega_A\} \tag{2.109}$$

of all states on $B(H)$ whose restriction $\omega_{|A}$ to $A$ is the given state $\omega_A$, is a compact convex subspace of the total state space $S(B(H))$ of $B(H)$, whose extreme boundary $\partial_eS_A$ consists of pure states on $B(H)$, i.e., $\partial_eS_A \subset P(B(H))$. Consequently, $\omega_A$ has a unique extension to a state on $B(H)$ iff it has a unique pure extension.


Proof. Convexity and ($w^*$) compactness are obvious. Let $\omega \in \partial_eS_A$ and suppose $\omega = t\omega_1 + (1-t)\omega_2$ for some $t \in (0,1)$ and $\omega_1, \omega_2 \in S(B(H))$. By assumption, $\omega_A = \omega_{|A} = t\omega_{1|A} + (1-t)\omega_{2|A}$ is pure on $A$, so $\omega_{1|A} = \omega_{2|A} = \omega_A$, hence $\omega_1, \omega_2 \in S_A$. Since $\omega \in \partial_eS_A$, this implies $\omega_1 = \omega_2 = \omega$. Hence $\omega$ is pure on $B(H)$.

Finally, $S_A$ is a singleton iff its boundary $\partial_eS_A$ is (since any state in $S_A$ has a convex decomposition in terms of states in its boundary), yielding the last claim.


This proposition remains true for infinite-dimensional $H$ (and even for arbitrary C*-algebras), but Theorem 2.21 becomes much more complicated. As we shall see, maximal commutative unital C*-subalgebras of $B(H)$ are no longer unique up to unitary equivalence, and the validity of the claim depends on which type of maximal subalgebra is considered. Also, the proof of what then is called the Kadison-Singer Conjecture becomes extremely difficult (with questionable relevance to physics).
2.7 Gleason's Theorem


Gleason's Theorem answers the following question in the positive: given probability distributions $p_a$ on $\sigma(a)$, for each self-adjoint operator $a \in B(H)$, satisfying (2.103), is there a single state $\omega$ on $B(H)$ inducing these probabilities through the Born rule? This question is closely related to various others that involve equivalent structures, cf. Definition 1.1. We denote the unit sphere in $H$ by $H_1 = \{\psi \in H, \|\psi\| = 1\}$, and write $\mathcal{P}(H) = \{e \in B(H) \mid e^2 = e^* = e\}$ for the set of all projections on $H$.


Definition 2.23. Let $H$ be a finite-dimensional Hilbert space, with unit sphere $H_1$.
1. A probability distribution on $\mathcal{P}(H)$ is a map $p : H_1 \to [0,1]$ that satisfies
$\sum_{i=1}^{\dim H} p(v_i) = 1$, for any basis $(v_i)$ of $H$. (2.110)
2. A probability measure on $\mathcal{P}(H)$ is a map $P : \mathcal{P}(H) \to [0,1]$ that satisfies:
$P(e+f) = P(e) + P(f)$ whenever $ef = 0$ (i.e., $eH \perp fH$); (2.111)
$P(1_H) = 1$. (2.112)


Note that $p$ is really defined on $\mathcal{P}_1(H)$, for we have $p(zv) = p(v)$ for all $z \in \mathbb{T}$ and
$v \in H_1$; to see this, extend $zv$ and $v$ to a basis of $H$ in the same way and use (2.110).
As in Definition 1.1, these notions of probability are equivalent, cf. (A.28):


• Given a probability measure $P$, one obtains a probability distribution $p$ by
$p(v) = P(e_v)$. (2.113)
• Given a probability distribution $p$, Lemma 2.24 below guarantees that
$P(e) = \sum_{i=1}^{\dim(eH)} p(v_i)$, (2.114)
where $(v_i)$ is any basis of $eH$, defines a probability measure $P$.


Lemma 2.24. If $p$ is a probability distribution on $\mathcal{P}(H)$ and $L \subset H$ is a linear
subspace, with basis $(v_i)$, then $\sum_{i=1}^{\dim(L)} p(v_i)$ is independent of this basis choice.


Proof. Extend $(v_i)$ to a basis of $H$ by adding a basis $(v'_j)$ of $L^\perp$. Take another basis
$(v''_i)$ of $L$ and complete it to a basis of $H$ by using the same basis $(v'_j)$ of $L^\perp$. Then

$\sum_i p(v_i) + \sum_j p(v'_j) = \sum_i p(v''_i) + \sum_j p(v'_j) = 1$, (2.115)

where we once again used (2.110). Hence $\sum_i p(v_i) = \sum_i p(v''_i)$.
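
As a numerical sanity check of (2.110) and Lemma 2.24, the following sketch evaluates a Born-rule distribution $p(v) = \langle v, \rho v\rangle$ on randomly chosen orthonormal bases of the real space $\mathbb{R}^3$. The matrix `RHO` and all helper names are hypothetical choices made for illustration only; the real three-dimensional setting is an assumption that keeps the code short.

```python
import math
import random

# Hypothetical density matrix on R^3: symmetric, positive, unit trace.
RHO = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.0],
       [0.0, 0.0, 0.2]]

def born(rho, v):
    """p(v) = <v, rho v> for a real symmetric 3x3 matrix rho."""
    return sum(v[i] * rho[i][j] * v[j] for i in range(3) for j in range(3))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent real vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = sum(x * y for x, y in zip(w, b))
            w = [x - c * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        basis.append([x / norm for x in w])
    return basis

def random_basis(rng):
    """A random orthonormal basis of R^3."""
    return gram_schmidt([[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)])

def total_probability(basis):
    """Left-hand side of (2.110) for the Born-rule distribution."""
    return sum(born(RHO, v) for v in basis)

def rotate_in_plane(v1, v2, angle):
    """Another orthonormal basis of the plane spanned by v1 and v2."""
    c, s = math.cos(angle), math.sin(angle)
    w1 = [c * a + s * b for a, b in zip(v1, v2)]
    w2 = [-s * a + c * b for a, b in zip(v1, v2)]
    return w1, w2
```

Here `total_probability` should return the trace of `RHO` for every orthonormal basis, and `rotate_in_plane` produces a second basis of the same two-dimensional subspace, which is the situation covered by Lemma 2.24.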

Clearly, a state $\omega$ on $B(H)$ induces a probability measure $P$ on $\mathcal{P}(H)$ by
$P(e) = \omega(e) = \mathrm{Tr}(\rho e)$, (2.116)

where $\rho$ is the density operator associated to $\omega$, as in (2.33). Therefore, it is a natural
question whether every probability measure on $\mathcal{P}(H)$ is induced by some state on $B(H)$ via
(2.116). This question is equivalent to the one above:


Proposition 2.25. • A probability measure $P$ on $\mathcal{P}(H)$ induces non-contextual
probability distributions $p_a$ on $\sigma(a)$ for each self-adjoint $a \in B(H)$ by

$p_a(\lambda) = P(e_\lambda)$; (2.117)

• Conversely, a family $(p_a)$ of non-contextual probability distributions (i.e., satisfying
(2.103)) gives rise to a probability measure $P$ on $\mathcal{P}(H)$ by

$P(e) = p_e(1)$. (2.118)

Proof. As defined by (2.117), $p_a$ is a probability distribution on $\sigma(a)$: by (A.38),
$\sum_{\lambda \in \sigma(a)} p_a(\lambda) = \sum_{\lambda \in \sigma(a)} P(e_\lambda) = P\big(\sum_{\lambda \in \sigma(a)} e_\lambda\big) = P(1_H) = 1$. (2.119)

Conversely, suppose $ef = 0$. Introduce $g = 1 - e - f$, and consider the self-adjoint
operator $a = \lambda_1 e + \lambda_2 f + \lambda_3 g$, for three different real numbers $\lambda_1, \lambda_2, \lambda_3$. By (2.103),

$P(e) = p_e(1) = p_a(\lambda_1)$, $P(f) = p_f(1) = p_a(\lambda_2)$, $P(g) = p_g(1) = p_a(\lambda_3)$.

Furthermore, since $\sigma(a) = \{\lambda_1, \lambda_2, \lambda_3\}$, we have $p_a(\lambda_1) + p_a(\lambda_2) + p_a(\lambda_3) = 1$ and
hence $P(e) + P(f) + P(g) = 1$. Also, $P(e+f) + P(g) = P(e+f+g) = P(1_H) = 1$.
The last two equations give $P(e+f) = P(e) + P(f)$.


Suppose $(e_i)_{i=1}^N$ is a family of projections on $H$ such that $\sum_i e_i = 1_H$ and $e_i e_j = \delta_{ij} e_i$.
Such a family generates a commutative unital C*-algebra $C = C^*(e_1, \ldots, e_N)$
in $B(H)$, which coincides with $C^*(a)$ for $a = \sum_i \lambda_i e_i$, where all $\lambda_i \in \mathbb{R}$ are different,
so that $\sigma(a) = \{\lambda_1, \ldots, \lambda_N\}$. All commutative unital C*-algebras in $B(H)$ arise
in this way, and $C$ is maximally abelian iff $N = \dim(H)$, i.e., iff each $e_i$ is one-dimensional.
The point is that a probability measure $P$ on $\mathcal{P}(H)$ induces a state $\omega_C$
on each $C = C^*(e_1, \ldots, e_N)$ (or, for $C = C^*(a)$, a probability measure $p_a$ on $\sigma(a)$):


1. if $a \in C$ is self-adjoint, then we have unique spectral resolutions (A.37), and put

$\omega_C(a) = \sum_{\lambda \in \sigma(a)} \lambda\, P(e_\lambda)$. (2.120)

2. if $c = a + ib \in C$ with $a$ and $b$ self-adjoint, we define $\omega_C(c) = \omega_C(a) + i\omega_C(b)$.


By Lemma 2.24, the map $\omega_C$ thus defined coincides with the linear extension of the
map $e_i \mapsto P(e_i)$ to $C$, which also shows that $\omega_C$ is linear. Clearly, $\omega_C$ is a state on $C$.

Again by Lemma 2.24, the ensuing family of states $\omega_C$ on all commutative unital
C*-algebras $C \subset B(H)$ is non-contextual (or, one might say, compatible) in the sense
that if $b \in C \cap C'$, then $\omega_C(b) = \omega_{C'}(b)$. In particular, if $C' \subset C$, then $\omega_{C|C'} = \omega_{C'}$
(where $\omega_{C|C'}$ is the restriction of $\omega_C$ to $C'$). It is convenient to extend this non-contextual
family $(\omega_C)$ of states to a well-defined map $\omega : B(H) \to \mathbb{C}$ by putting

$\omega(a + ib) = \omega_{C^*(a)}(a) + i\,\omega_{C^*(b)}(b)$, $a, b \in B(H)$, $a^* = a$, $b^* = b$. (2.121)

Definition 2.26. A quasi-state on $B(H)$ is a map $\omega : B(H) \to \mathbb{C}$ that is positive
($\omega(a^*a) \geq 0$) and normalized ($\omega(1_H) = 1$), cf. Definition 2.4, and otherwise:

1. satisfies $\omega(a) = \omega(a') + i\omega(a'')$, where $a' = \tfrac{1}{2}(a + a^*)$ and $a'' = -\tfrac{1}{2}i(a - a^*)$.
2. is linear on each commutative unital C*-algebra in $B(H)$.


Note that $a'$ and $a''$ are self-adjoint, so that $\omega$ is fixed by its values on $B(H)_{sa}$. Hence
we have $\omega(za) = z\omega(a)$, $z \in \mathbb{C}$, and $\omega(a + b) = \omega(a) + \omega(b)$ whenever $ab = ba$.

Proposition 2.27. The map $\omega : B(H) \to \mathbb{C}$ defined by (2.120) and (2.121) is a quasi-state
on $B(H)$. Any quasi-state on $B(H)$ arises in this way, giving a bijective correspondence
between quasi-states on $B(H)$ and probability measures on $\mathcal{P}(H)$.


Proof. The first claim holds by construction. Conversely, a quasi-state $\omega$ yields a
probability measure $P$ via $P(e) = \omega(e)$, cf. (2.116).


Theorem 1.15 shows that each state on C(X) is induced by a probability measure 
(and, trivially, also the other way round). Although Theorem 2.7 is already a quan- 
tum version of Theorem 1.15, an even better parallel would involve the probability 
measures of Definition 2.23. This is indeed what Gleason’s Theorem achieves, en 
passant answering all versions of our lead question: 


Theorem 2.28. Let $H$ be a finite-dimensional Hilbert space of dimension $> 2$. Then
each probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique state $\omega$ on $B(H)$ via

$P(e) = \omega(e)$. (2.122)

Equivalently, each probability distribution $p$ on $\mathcal{P}(H)$ is given by

$p(v) = \langle v, \rho v\rangle$, (2.123)

where $\rho$ is a unique density operator on $H$. Hence every quasi-state is a state.

This completes the following list (of which 1-5 do not require Gleason's Theorem).


Corollary 2.29. Let $H$ be a finite-dimensional Hilbert space. The following notions
are equivalent (i.e., there are natural bijective correspondences between them):

1. Non-contextual families of states on commutative unital C*-algebras $C \subset B(H)$;
2. Non-contextual families of probability measures on spectra $\sigma(a)$, cf. (2.103);
3. Probability distributions on $\mathcal{P}(H)$;
4. Probability measures on $\mathcal{P}(H)$;
5. Quasi-states on $B(H)$;
6. States on $B(H)$.
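
The uniqueness of the density operator in (2.123) can be illustrated in the real symmetric case: once $p(v) = \langle v, \rho v\rangle$ is known, finitely many values of $p$ already fix $\rho$ by polarization. The sketch below is only an illustration under that assumption (it takes (2.123) for granted rather than proving it), and the names `born` and `reconstruct` are hypothetical.

```python
import math

def born(rho, v):
    """p(v) = <v, rho v> for a real symmetric matrix rho."""
    n = len(v)
    return sum(v[i] * rho[i][j] * v[j] for i in range(n) for j in range(n))

def reconstruct(p, n=3):
    """Recover a real symmetric matrix from the values of p on unit vectors.

    Diagonal entries are p(e_i); off-diagonal entries follow from
    p((e_i + e_j)/sqrt(2)) = (rho_ii + rho_jj)/2 + rho_ij.
    """
    rho = [[0.0] * n for _ in range(n)]
    for i in range(n):
        e_i = [1.0 if k == i else 0.0 for k in range(n)]
        rho[i][i] = p(e_i)
    s = 1.0 / math.sqrt(2.0)
    for i in range(n):
        for j in range(i + 1, n):
            v = [s if k in (i, j) else 0.0 for k in range(n)]
            rho[i][j] = rho[j][i] = p(v) - 0.5 * (rho[i][i] + rho[j][j])
    return rho
```

In the complex case one would additionally need the vectors $(e_i + i e_j)/\sqrt{2}$ to recover the imaginary parts of the off-diagonal entries.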

2.8 Proof of Gleason's Theorem


The difficulty of Theorem 2.28 should already be clear from the fact that it is false
if $\dim(H) = 2$: as we have seen in (2.37), a state on $M_2(\mathbb{C}) = B(\mathbb{C}^2)$ is given by
three real parameters, whereas a probability measure $P$ on $\mathcal{P}(\mathbb{C}^2)$ can assign arbitrary
values $P(e)$ to one-dimensional projections $e$, as long as $P(1 - e) = 1 - P(e)$.
Equivalently, this time from the perspective of probability distributions $p$, each unit
vector in $\mathbb{C}^2$ belongs to a unique basis (up to a phase), so that $p$ can assign an arbitrary
value to one of the two vectors in each basis and is unconstrained otherwise.

In higher dimensions, however, one-dimensional projections always belong to 
infinitely many orthogonal sets, whilst unit vectors belong to infinitely many bases. 
This constrains the possible values P or p may take, and these constraints turn out 
to be strong enough to enforce (2.116). 
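
The failure in dimension two can be made concrete. The sketch below uses the real plane $\mathbb{R}^2$ as a simplified stand-in for $\mathbb{C}^2$ (an assumption made for brevity) and a hypothetical function `p` on the unit circle that satisfies (2.110) but is not of the Born form (2.123).

```python
import math

def p(theta):
    """A hypothetical probability distribution on the unit circle of R^2.

    It satisfies p(theta) + p(theta + pi/2) = 1 and p(theta + pi) = p(theta),
    so it obeys (2.110) for every orthonormal basis of R^2.
    """
    return 0.5 * (1.0 + math.cos(6.0 * theta))

def born_candidate(theta):
    """The only quadratic form <v, rho v> agreeing with p at 0, pi/4 and pi/2."""
    r11 = p(0.0)
    r22 = p(math.pi / 2)
    r12 = p(math.pi / 4) - 0.5 * (r11 + r22)
    c, s = math.cos(theta), math.sin(theta)
    return r11 * c * c + 2.0 * r12 * c * s + r22 * s * s
```

Comparing `p` and `born_candidate` at an angle such as $\pi/6$ shows that no symmetric $2 \times 2$ matrix reproduces `p`, although `p` is a perfectly good probability distribution in the sense of Definition 2.23.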

The proof of Theorem 2.28 consists of two nontrivial parts, the second of which is
notoriously difficult. By exception in quantum-mechanical reasoning, both involve
$\mathbb{R}^3$ as a real Hilbert space, whose elements $\mathbf{x} = (x, y, z)$ have standard inner product

$\langle \mathbf{x}, \mathbf{x}'\rangle = xx' + yy' + zz'$, (2.124)

with the ensuing (Pythagorean) norm and (Euclidean) notion of orthogonality.


Proposition 2.30. If Theorem 2.28 holds for the real Hilbert space $\mathbb{R}^3$, then it holds
for any complex finite-dimensional Hilbert space of dimension $> 2$.


Proposition 2.31. Theorem 2.28 holds for the real Hilbert space $\mathbb{R}^3$.


Proposition 2.30 is a conjunction of two lemmas. 


Lemma 2.32. If (2.123) holds for $\mathbb{R}^3$, where $\rho$ is some symmetric operator, then
(2.123) holds for $\mathbb{C}^3$, where $\rho$ is a self-adjoint operator.


Neither positivity nor normalization of $\rho$ plays a role in the argument; once we have
(2.123) in this more general sense, the conclusion that $\rho$ is a density operator trivially
follows from the definition of $p$. This also applies to the second sublemma.


Lemma 2.33. If (2.123) holds for $\mathbb{C}^3$, then it holds for any complex finite-dimensional
Hilbert space of dimension $> 2$.


It will be convenient to extend $p : H_1 \to [0,1]$ to a function $Q : H \to \mathbb{R}$ by
$Q(0) = 0$; (2.125)
$Q(v) = \|v\|^2\, p\big(v/\|v\|\big)$ $(v \neq 0)$, (2.126)


so that (2.123) is evidently equivalent to the analogous expression


$Q(\psi) = \langle \psi, \rho\psi\rangle$ $(\psi \in H)$. (2.127)

Given (2.127), the minimax principle for real symmetric matrices implies that $Q$ is
maximized on $H_1$ by $\psi \in H_1$ iff $\rho\psi = \lambda\psi$, where $\lambda$ is the largest eigenvalue of $\rho$.

Proof of Lemma 2.32. Suppose $p : \mathbb{C}^3_1 \to [0,1]$ is a probability distribution (in the
sense of Definition 2.23). The first step shows that $p$ assumes a maximum on the unit
sphere $\mathbb{C}^3_1$ (note that $\mathbb{C}^3_1$ is compact, but we do not know yet if $p$ is continuous!).
Since $0 \leq p(v) \leq 1$ for $v \in \mathbb{C}^3_1$, $M = \sup\{p(v), v \in \mathbb{C}^3_1\}$ exists, and there is a
sequence $(v_n)$ in $\mathbb{C}^3_1$ for which $p(v_n) \to M$. Since $\mathbb{C}^3_1$ is compact, this sequence has
a convergent subsequence, with limit $v_\infty \in \mathbb{C}^3_1$. Furthermore, we may assume that
$\langle v_n, v_\infty\rangle \in \mathbb{R}$, for if not, we change to $v'_n = z_n v_n$ with $z_n = \langle v_\infty, v_n\rangle / |\langle v_n, v_\infty\rangle|$.

For each fixed $n$ (with $v_n$ in the convergent subsequence in question), the real
linear span of $v_\infty$ and $v_n$ is isomorphic to $\mathbb{R}^2$ as a Hilbert space (with standard inner
product), embedded in any $\mathbb{R}^3 \subset \mathbb{C}^3$ one likes (where, once again, $\mathbb{R}^3$ is seen as a real
Hilbert subspace in the sense that all inner products of vectors in $\mathbb{R}^3$ are real). By
assumption, (2.123) holds on $\mathbb{R}^3$ and hence also on $\mathbb{R}^2 \subset \mathbb{R}^3$, so that, in particular,

$|p(v_\infty) - p(v_n)| = |\langle v_\infty, \rho v_\infty\rangle - \langle v_n, \rho v_n\rangle| = |\langle (v_\infty - v_n), \rho(v_\infty + v_n)\rangle|$
$\leq \|\rho\|\,\|v_\infty + v_n\|\,\|v_\infty - v_n\| \leq 2\|\rho\|\,\|v_\infty - v_n\|$,

since $\|v_\infty + v_n\| \leq \|v_\infty\| + \|v_n\|$ and $\|v_\infty\| = \|v_n\| = 1$. Consequently,

$|p(v_\infty) - M| \leq |p(v_\infty) - p(v_n)| + |p(v_n) - M| \leq 2\|\rho\|\,\|v_\infty - v_n\| + |p(v_n) - M|$,

so letting $n \to \infty$ makes both terms on the right-hand side vanish. Hence $p(v_\infty) = M$.

For reasons to become clear soon, we relabel $v_\infty = v_1$. Take any $v_0 \in \mathbb{C}^3_1$ with
$\langle v_0, v_1\rangle = 0$ and consider the real Hilbert space $\mathbb{R}^2 \subset \mathbb{C}^3$ spanned by $v_1$ and $v_0$. By
assumption, (2.127) holds, and by the minimax principle, $\rho v_1 = \lambda v_1 = p(v_1) v_1$,
with $p(v_1) = M$. Hence for any $v = t_0 v_0 + t_1 v_1$, with $t_0, t_1 \in \mathbb{R}$, we have

$Q(v) = \langle t_0 v_0 + t_1 v_1, \rho(t_0 v_0 + t_1 v_1)\rangle = |t_0|^2 p(v_0) + |t_1|^2 p(v_1)$. (2.128)

We claim that this also holds for complex coefficients $t_0, t_1 \in \mathbb{C}$. Indeed, for $t_0, t_1 \neq 0$
(the remaining cases being immediate from (2.126)), we have, by (2.126),

$Q(t_0 v_0 + t_1 v_1) = |t_1|^2\, Q\big(|t_0/t_1|\, v'_0 + v_1\big) = |t_0|^2 p(v_0) + |t_1|^2 p(v_1)$, (2.129)

where we used (2.128) with $v'_0 = \big((t_0/t_1)/|t_0/t_1|\big) v_0$ instead of $v_0$; this is still a
vector orthogonal to $v_1$, and we also used $Q(v'_0) = p(v'_0) = p(v_0)$.

We now repeat this analysis on the part $(\mathbb{C}^3_1)_{\perp v_1}$ of $\mathbb{C}^3_1$ that consists of all unit
vectors orthogonal to $v_1$, which remains compact. Thus $p$ assumes a maximum at
some unit vector $v_2 \in (\mathbb{C}^3_1)_{\perp v_1}$, and we may complete the pair $(v_1, v_2)$ to a basis
$(v_1, v_2, v_3)$ of $\mathbb{C}^3$. With $v_0 = t_2 v_2 + t_3 v_3$, the above argument (on $(\mathbb{C}^3_1)_{\perp v_1}$) gives

$p(v_0) = Q(v_0) = |t_2|^2 p(v_2) + |t_3|^2 p(v_3)$. (2.130)

Combined with (2.129) at $t_0 = 1$, this gives, for any coefficients $t_1, t_2, t_3 \in \mathbb{C}$,

$Q(t_1 v_1 + t_2 v_2 + t_3 v_3) = |t_1|^2 p(v_1) + |t_2|^2 p(v_2) + |t_3|^2 p(v_3)$. (2.131)

Hence (2.127) holds on all of $\mathbb{C}^3$, with

$\rho = p(v_1)|v_1\rangle\langle v_1| + p(v_2)|v_2\rangle\langle v_2| + p(v_3)|v_3\rangle\langle v_3|$.


Proof of Lemma 2.33. Let $H$ be a complex finite-dimensional Hilbert space of dimension
$> 3$, equipped with a probability distribution $p$, and define $Q : H \to \mathbb{R}$ by
(2.125) - (2.126). We need to prove (2.127) for some self-adjoint operator $\rho$. By
Propositions A.4 and A.23, this is equivalent to $Q$ being a quadratic form. Since
(A.8) evidently holds, we just need to prove (A.9). Take any three-dimensional
Hilbert space $L_3 \subset H$ containing $v$ and $w$. By assumption, there exists a self-adjoint
operator $\rho_{L_3}$ on $L_3$ for which (2.127) is valid for all $\psi \in L_3$. Taking $\psi = v$, $\psi = w$,
$\psi = v + w$, and $\psi = v - w$ then validates (A.9). This completes the first proof.
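
The last step can be checked numerically. The sketch below assumes that (A.9) is the parallelogram identity $Q(v+w) + Q(v-w) = 2Q(v) + 2Q(w)$ and that (A.8) is the homogeneity $Q(zv) = |z|^2 Q(v)$; neither is restated in this section, so both readings are assumptions. The function names are hypothetical.

```python
def inner(v, w):
    """Inner product on C^n, taken linear in the second entry."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def apply(rho, v):
    """Matrix-vector product rho v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in rho]

def Q(rho, v):
    """Q(v) = <v, rho v>, which is real when rho is self-adjoint."""
    return inner(v, apply(rho, v)).real

def parallelogram_defect(rho, v, w):
    """Q(v+w) + Q(v-w) - 2Q(v) - 2Q(w); zero for any quadratic form."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return Q(rho, plus) + Q(rho, minus) - 2.0 * Q(rho, v) - 2.0 * Q(rho, w)
```

For a self-adjoint `rho` the defect vanishes up to rounding, which is the identity obtained above by evaluating (2.127) at the four vectors $v$, $w$, $v+w$ and $v-w$.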

This lemma may also be proved without invoking Proposition A.4, as follows.
If $v$ and $w$ are linearly independent, they are contained in a unique two-dimensional
subspace $L_2 \subset H$, which in turn is contained in a (non-unique) three-dimensional
subspace $L_3 \subset H$. Take $\rho_{L_3}$ as above and define a bilinear form $B$ on $L_2$ by
$B(v, w) = \langle v, \rho_{L_3} w\rangle$. Defining the associated quadratic form $Q$ by (A.7), we see that
(2.125) - (2.126) hold, from which we also conclude that $B$ is independent of the
choice of $L_3 \supset L_2$. If $v$ and $w$ are linearly dependent, a similar argument shows
that $B$ is independent of the choice of the subspace $L_3$ containing $v$ and $w$. Hence
$B : H \times H \to \mathbb{C}$ is well defined, and to conclude that it is a self-adjoint form we
need to check that $B(v, \lambda w + x) = \lambda B(v, w) + B(v, x)$ for all $v, w, x \in H$, $\lambda \in \mathbb{C}$, cf.
Definition A.1. If $v$, $w$, and $x$ are linearly independent, this can be done by passing
to the unique three-dimensional subspace $L_3 \subset H$ containing these vectors. If they
are not, we are already done by the previous step. Finally, given that $B$ is a bilinear
form, a self-adjoint operator $\rho$ may be reconstructed from Proposition A.23, upon
which (2.127) holds by construction.


Proposition 2.31 again follows from two lemmas by modus ponens. 


Lemma 2.34. Any probability distribution on $\mathbb{R}^3$ (cf. Definition 2.23) is continuous.


Lemma 2.35. Any continuous probability distribution on $\mathbb{R}^3$ satisfies (2.127), for
some self-adjoint operator $\rho$.


The operator $\rho$ obtained by Lemma 2.35 is necessarily positive and automatically
has unit trace. Another way to phrase this is to take the complex linear span of all
probability distributions on the unit sphere $\mathbb{R}^3_1 = S^2$ in $\mathbb{R}^3$; this yields a vector space
$\mathcal{F}(S^2)$, whose elements are called frame functions. These are bounded functions

$f : S^2 \to \mathbb{C}$,

with the property that for any basis $(u_1, u_2, u_3)$ of $\mathbb{R}^3$ one has

$f(u_1) + f(u_2) + f(u_3) = w(f)$, (2.132)

where $w(f) \in \mathbb{C}$ does not depend on the basis and is called the weight of the frame
function $f$. For a probability distribution $p$ we obviously have $w(p) = 1$. The natural
norm on $\mathcal{F}(S^2)$ is the supremum-norm inherited from $C(S^2)$, and like the latter,
$\mathcal{F}(S^2)$ is closed in this norm (and hence is a Banach space in its own right, a fact
that will play an important technical role in Lemma 2.40 below).

As for probability distributions, (2.132) implies a lemma that will often be used: 


Lemma 2.36. If $(u_1, u_2)$ is a basis of some two-dimensional linear subspace of $\mathbb{R}^3$,
then $f(u_1) + f(u_2)$ is independent of the choice of this pair. Hence if $C$ is some great
circle in $S^2$ and $u_1 \perp u_2$ for $u_1, u_2 \in C$, then $f(u_1) + f(u_2)$ only depends on $C$.


Furthermore, by similar arguments any frame function is even, i.e., $f(-u) = f(u)$.

The proof of Lemma 2.34 will actually show that every frame function on $S^2$
is continuous, whilst the proof of Lemma 2.35 will establish the property that any
continuous frame function on $S^2$ satisfies (2.127), for some self-adjoint operator $\rho$.


Proof of Lemma 2.34. Let $f : S^2 \to \mathbb{R}$ be a frame function (the complex-valued case
follows by decomposing $f$ into a real and an imaginary part). Since constants are
frame functions, adding a constant to $f$ if necessary we may assume

$\inf\{f(x), x \in S^2\} = 0$. (2.133)

Hence for given $\varepsilon > 0$ there exists $p \in S^2$ with

$f(p) < \varepsilon/2$. (2.134)

Performing a rotation if necessary, we may assume that $p = (0, 0, 1)$ is the north
pole. It is useful to introduce another frame function $g : S^2 \to \mathbb{R}^+$ by

$g(x) = f(x) + f(R_z(\pi/2)x)$, (2.135)

where $R_z(\pi/2)$ is the (counter-clockwise) rotation around the $z$-axis by an angle
$\pi/2$. It is easy to see that $g$ is constant on the equator $E$: for $x \in E$, consider the
basis $(x, R_z(\pi/2)x, p)$ of $\mathbb{R}^3$, so that $g(x) = w(f) - f(p)$ is independent of $x$.
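
The constancy of $g$ on the equator can be observed numerically for a frame function of the quadratic type $f(x) = \langle x, \rho x\rangle$. This choice of $f$, the matrix `RHO`, and the helper names are assumptions made for illustration; the lemma itself concerns arbitrary frame functions.

```python
import math

# Hypothetical symmetric matrix defining f(x) = <x, RHO x>.
RHO = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.05],
       [0.0, 0.05, 0.2]]

def f(x):
    """A quadratic frame function on S^2."""
    return sum(x[i] * RHO[i][j] * x[j] for i in range(3) for j in range(3))

def rot_z_quarter(x):
    """Counter-clockwise rotation by pi/2 about the z-axis: (x, y, z) -> (-y, x, z)."""
    return (-x[1], x[0], x[2])

def g(x):
    """The auxiliary frame function of (2.135)."""
    return f(x) + f(rot_z_quarter(x))

def equator_point(phi):
    return (math.cos(phi), math.sin(phi), 0.0)

NORTH_POLE = (0.0, 0.0, 1.0)
WEIGHT = RHO[0][0] + RHO[1][1] + RHO[2][2]   # w(f) equals the trace for this f
```

For every angle `phi`, `g(equator_point(phi))` should agree with `WEIGHT - f(NORTH_POLE)`, as in the argument above.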

Furthermore, for any $U \subset S^2$ consider the oscillation of $f$ at $U$, defined by

$\mathrm{Osc}_U(f) = \sup_U(f) - \inf_U(f) = \sup\{f(u), u \in U\} - \inf\{f(u), u \in U\}$. (2.136)

If, for given $x \in S^2$, for any $\varepsilon > 0$ there is a neighbourhood $U \subset S^2$ of $x$ on which
$\mathrm{Osc}_U(f) < \varepsilon$, then $|f(x) - f(u)| < \varepsilon$ for all $u \in U$, so that $f$ is continuous at $x$.
The lengthier steps in the proof of Lemma 2.34 are now as follows:


Lemma 2.37. Given that $g(p) < \varepsilon$, there is an open set $U \subset S^2$ on which
$\mathrm{Osc}_U(g) < 3\varepsilon$.


Lemma 2.38. For any non-negative frame function $h$, if $\mathrm{Osc}_U(h) < \varepsilon'$ for some open
$U$, then each point $x \in S^2$ has a neighborhood $V$ where

$\mathrm{Osc}_V(h) < 4\varepsilon'$.

Assuming these lemmas (to be proved below), continuity of $f$ easily follows:
1. Lemmas 2.37 and 2.38 applied to $h = g$ and $x = p$ yield $\mathrm{Osc}_V(g) < 12\varepsilon$ for some
neighbourhood $V$ of $p$. Now $g(p) < \varepsilon$, hence $\inf\{g(v), v \in V\} < \varepsilon$, hence

$\sup_V(f) \leq \sup_V(g) \leq \mathrm{Osc}_V(g) + \inf_V(g) < 13\varepsilon$.

2. Since $f \geq 0$ and hence $0 \leq \inf_V(f) \leq \sup_V(f)$, this yields $\mathrm{Osc}_V(f) < 13\varepsilon$.

3. Applying Lemma 2.38 to $h = f$ and $U = V$ gives that each point $x \in S^2$ has a
neighborhood $W$ where $\mathrm{Osc}_W(f) < 52\varepsilon$.

4. Hence $|f(x) - f(w)| < 52\varepsilon$ for all $w \in W$. Since $\varepsilon > 0$ was arbitrary, it follows
that $f$ is continuous at $x$, and since $x$ was arbitrary, $f$ is continuous on all of $S^2$.
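
The bookkeeping of constants in steps 1-4 may be summarized as follows; this is merely a restatement of the chain above, using $f \leq g$ (which holds because $f \geq 0$) in the third line.

```latex
\begin{align*}
\mathrm{Osc}_U(g) &< 3\varepsilon
  && \text{(Lemma 2.37)}\\
\mathrm{Osc}_V(g) &< 4 \cdot 3\varepsilon = 12\varepsilon
  && \text{(Lemma 2.38 with } \varepsilon' = 3\varepsilon\text{)}\\
\sup\nolimits_V f \le \sup\nolimits_V g &\le \mathrm{Osc}_V(g) + \inf\nolimits_V g
  < 12\varepsilon + \varepsilon = 13\varepsilon
  && \text{(since } \inf\nolimits_V g \le g(p) < \varepsilon\text{)}\\
\mathrm{Osc}_W(f) &< 4 \cdot 13\varepsilon = 52\varepsilon
  && \text{(Lemma 2.38 with } \varepsilon' = 13\varepsilon\text{)}
\end{align*}
```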


For $p \neq u \in N$, where $N$ is the open northern hemisphere, let $C_u$ be the unique great
circle through $u$ with one (and hence both) of the following equivalent properties:


• the point of greatest latitude on $C_u$ is $u$;
• $C_u$ cuts the equator $E$ at two points that are both orthogonal to $u$.


We write $D_u = C_u \cap N$, and for each $z \in N$, we introduce the set

$DD_z = \{x \in N \mid \exists\, y \in D_x,\ z \in D_y\}$. (2.137)

Geometrically, $DD_z$ consists of the points $x$ on the northern hemisphere from which
$z$ can be reached by "double descent", where we say that $y \in N$ may be reached
from some point $x$ at higher latitude by (single) descent if $y \in C_x$. The proof of our
lemmas relies on the following two facts from spherical geometry (stated without
proof, as they have nothing to do with frame functions, though the second is easy).


Lemma 2.39. 1. The set $DD_z$ in (2.137) has open interior.
2. For any $x \in S^2$ there exists $y \in E$ such that $x$ lies on the equator $E_y$ relative to $y$
regarded as the north pole (so in this terminology, $E = E_p$).


Proof of Lemma 2.37. By definition of the infimum, for each $\varepsilon > 0$ there exists $z \in N$
such that
$\inf_N g \leq g(z) \leq \inf_N g + \varepsilon$. (2.138)


The open set $U$ in question will be the interior of $DD_z$. The crucial inequality is
$g(x) < g(z) + 2\varepsilon$ $(x \in DD_z)$, (2.139)


which together with (2.138) yields $\inf_N g \leq g(x) < \inf_N g + 3\varepsilon$ for each $x \in DD_z$,
whence $\mathrm{Osc}_U(g) < 3\varepsilon$. So we need to prove (2.139), given the assumption $g(p) < \varepsilon$,
which is immediate from (2.134) and (2.135).

To prove (2.139), take $r \in N$ and $s \in C_r \cap E$, so $r \perp s$ and hence

$g(r) + g(s) \leq w(g)$. (2.140)

Furthermore, take $t, u \in E$, $t \perp u$, so that $(t, u, p)$ is a basis and, $g$ being a frame
function, we have

$g(t) + g(u) + g(p) = w(g)$. (2.141)

But by construction $g$ is constant on the equator $E$, so $g(t) = g(u) = k$, hence
$2k + g(p) = w(g)$, and (2.140) yields

$g(r) \leq w(g) - g(s) = 2k + g(p) - g(s) = k + g(p)$,

from which

$k - g(r) \geq -g(p)$. (2.142)

Furthermore, for $q \in N$, $x, r \in D_q$, $x \perp r$, there exists $q' \in C_q \cap E$ such that

$g(x) + g(r) = g(q) + g(q') = g(q) + k$,

from which, using (2.142), we obtain

$g(x) = g(q) + k - g(r) \geq g(q) - g(p)$,

and hence

$g(q) \leq g(x) + g(p)$, $q \in N$, $x \in D_q$. (2.143)

Applying this twice, in accordance with the definition (2.137) of double descent, we find

$g(x) \leq g(y) + g(p) \leq g(z) + 2g(p)$, $y \in D_x$, $z \in D_y$. (2.144)

Since (2.134) and (2.135) imply $g(p) < \varepsilon$, this yields (2.139).


Proof of Lemma 2.38. We may assume $p \in U = U_p$. Using Lemma 2.39.2, by the
argument to come we then move $U_p$ to a neighborhood of $y$ called $U_y$, and subsequently
repeat the argument so as to move $U_y$ to $U_x = V$ as specified in the lemma.
We use spherical coordinates $(\varphi, \theta)$ for $\mathbf{x} = (x, y, z) \in S^2$, given by

$x = \cos\varphi \sin\theta$, $y = \sin\varphi \sin\theta$, $z = \cos\theta$, $\varphi \in [0, 2\pi)$, $\theta \in [0, \pi]$. (2.145)

Hence the north pole $p = (0, 0, 1)$ has $\theta = 0$ and $\varphi$ undefined (note that $(\varphi, \theta)$ are
essentially (longitude, latitude), except that the latter usually starts counting downwards
from $+\tfrac{1}{2}\pi$ to $-\tfrac{1}{2}\pi$, with the north pole having latitude $\tfrac{1}{2}\pi$). Since $U$ is open,
there exists $\delta > 0$ such that all points with $0 \leq \theta < \delta$ belong to $U$. Pick $y \in E$ as
above, and define $r$ as the point with the same $\varphi$ as $y$ but $\theta_r = \theta_y + \tfrac{1}{2}\delta$ (so that $r$ lies
a little south of $y$). Then inspection of $S^2$ shows that one can find a neighborhood
$U_y$ of $y$ with the following property: for any $u \in U_y$ there exists a great circle $C$
through $r$ and $u$ that contains two further points $r' \in U_p$ and $u' \in U_p$ such that $r \perp r'$
and $u \perp u'$. Hence $h(r) + h(r') = h(u) + h(u')$. Doing this for two different points
$u = u_1$ and $u = u_2$ gives

$h(r) + h(r'_1) = h(u_1) + h(u'_1)$;
$h(r) + h(r'_2) = h(u_2) + h(u'_2)$.

Hence $h(u_1) - h(u_2) = h(r'_1) - h(r'_2) - (h(u'_1) - h(u'_2))$, from which we obtain

$|h(u_1) - h(u_2)| \leq |h(r'_1) - h(r'_2)| + |h(u'_1) - h(u'_2)| \leq \mathrm{Osc}_U(h) + \mathrm{Osc}_U(h) < 2\varepsilon'$,

for by assumption, $\mathrm{Osc}_U(h) < \varepsilon'$. Since $u_1$ and $u_2$ in $U_y$ were arbitrary, this gives

$\mathrm{Osc}_{U_y}(h) < 2\varepsilon'$. (2.146)

Repeating this with $y$ as the north pole gives $\mathrm{Osc}_{U_x}(h) < 4\varepsilon'$, i.e., the lemma.


To prove Lemma 2.35, following Gleason himself we consider the natural action
of the rotation group $SO(3)$ (with positive determinant) on $\mathbb{R}^3$, written $R : x \mapsto Rx$.
This action maps $S^2$ onto itself and hence induces an action $U$ on $C(S^2)$ by pullback:

$U(R)f(u) = f(R^{-1}u)$. (2.147)

By Lemma 2.34 we have inclusions

$\mathcal{F}(S^2) \subset C_e(S^2) \subset C(S^2)$, (2.148)

where $\mathcal{F}(S^2)$ are the frame functions and $C_e(S^2)$ consists of the even functions in
$C(S^2)$; both spaces are obviously stable under the action (2.147). The following
facts, due to Weyl, which we state without proof, follow from elementary representation
theory, but they are also quite easily verified by explicit computation. Let

$\psi_\ell(x, y, z) = (x + iy)^\ell$, $\ell \in \mathbb{N}$, (2.149)

and restrict this function to $S^2$, still calling it $\psi_\ell$. Let $H_\ell \subset C(S^2)$ be the vector space
spanned by all transforms $U(R)\psi_\ell$, $R \in SO(3)$. This vector space:


• consists of all homogeneous polynomials of degree $\ell$ that are orthogonal (with
respect to the inner product in $L^2(S^2)$) to any such polynomials of degree $\ell - 2$;
• has a basis consisting of the spherical harmonics $Y_\ell^m$, $m = -\ell, -\ell+1, \ldots, \ell-1, \ell$;
• accordingly, has finite dimension equal to $\dim(H_\ell) = 2\ell + 1$;
• is irreducible under the natural $SO(3)$-action (2.147).


Indeed, all (necessarily finite-dimensional) irreducible representations of $SO(3)$
arise in this way. Now $\mathcal{F}(S^2)$ is closed under the $SO(3)$-action (2.147), hence so
must be $\mathcal{F}(S^2) \cap H_\ell$. Since $H_\ell$ is irreducible, there are merely two possibilities:

$H_\ell \subset \mathcal{F}(S^2)$; (2.150)
$H_\ell \cap \mathcal{F}(S^2) = \{0\}$. (2.151)

Since for even/odd values of $\ell$ the space $H_\ell$ consists of even/odd functions, and
$\mathcal{F}(S^2)$ only has even elements, we immediately see that (2.151) applies if $\ell$ is odd.
For even values of $\ell$, we see at once that (2.150) holds for:


• $\ell = 0$, where the constant frame function $f(x, y, z) = c = \tfrac{1}{3}w(f) \neq 0$ is obviously
induced by the operator $\rho = c \cdot 1_3$ (where $1_3$ is the $3 \times 3$ unit matrix), cf. (2.127);
• $\ell = 2$, which corresponds to frame functions $f$ with weight $w(f) = 0$.
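
The latter functions are induced by operators $\rho$ with zero trace. To see this, diagonalize
$\rho$ in $\mathbb{C}^3$ as in (2.6), without the constraints on $p_i$. This yields

$f(x) = \langle x, \rho x\rangle = \sum_{i=1}^{3} p_i |\langle x, v_i\rangle|^2$. (2.152)

For $f \in H_2$, since $H_2 \perp H_0$ in $L^2(S^2)$ we must have

$\langle 1_{S^2}, f\rangle_{L^2(S^2)} = \int_{S^2} d^2x\, f(x) = 0$. (2.153)

For any $v \in \mathbb{C}^3$, we have

$\int_{S^2} d^2x\, |\langle x, v\rangle|^2 = \frac{4\pi}{3}\|v\|^2$; (2.154)

to see this, write $|\langle x, v\rangle|^2 = |v_x|^2 x^2 + |v_y|^2 y^2 + |v_z|^2 z^2$ plus cross terms in $xy$, $xz$, and $yz$
that integrate to zero over $S^2$, and use the surface element
$d^2x = d\varphi\, d\theta \sin\theta$ associated to the spherical coordinates (2.145) to compute

$\int_{S^2} d^2x\, x^2 = \int_{S^2} d^2x\, y^2 = \int_{S^2} d^2x\, z^2 = \frac{4\pi}{3}$. (2.155)

Therefore, from (2.152), noting that $\|v_i\|^2 = 1$ for each $i = 1, 2, 3$, we obtain

$\int_{S^2} d^2x\, f(x) = \frac{4\pi}{3} \sum_{i=1}^{3} p_i = \frac{4\pi}{3}\, \mathrm{Tr}(\rho)$. (2.156)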
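
The integrals (2.155) can be confirmed with a simple quadrature in the spherical coordinates (2.145). The following is a rough numerical sketch (a midpoint rule with hypothetical grid sizes), not a substitute for the exact computation.

```python
import math

def sphere_integral(func, n_theta=200, n_phi=400):
    """Midpoint-rule approximation of the integral of func(x, y, z) over S^2,
    using the surface element sin(theta) dtheta dphi."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x, y, z = math.cos(phi) * sin_t, math.sin(phi) * sin_t, cos_t
            total += func(x, y, z) * sin_t * d_theta * d_phi
    return total
```

Applied to $x^2$, $y^2$ or $z^2$ this should give a value close to $4\pi/3$, and applied to a diagonal quadratic form $\sum_i p_i x_i^2$ with $\sum_i p_i = 0$ it should give a value close to zero, in line with (2.153) and (2.156).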


To settle the case $\ell \geq 4$, all we need to know about the spherical harmonics is
that if $\ell$ is even, then, once again using spherical coordinates, one has

$Y_\ell^m(x, y, z = 0) \sim e^{im\varphi}$ ($m$ even); (2.157)
$Y_\ell^m(x, y, z = 0) = 0$ ($m$ odd). (2.158)

If (2.150) holds, then $Y_\ell^m \in \mathcal{F}(S^2)$ for each $m = -\ell, -\ell+1, \ldots, \ell-1, \ell$. But for any
(even) $\ell \geq 4$, there are values of $m$ for which $Y_\ell^m$ cannot be a frame function. To see
this, take the following family of bases of $\mathbb{R}^3$, indexed by $\varphi$:

$u_1 = (\cos\varphi, \sin\varphi, 0)$; (2.159)
$u_2 = (-\sin\varphi, \cos\varphi, 0)$; (2.160)
$u_3 = (0, 0, 1)$. (2.161)

For any frame function $f$, the value of $f(u_1) + f(u_2) = w(f) - f(u_3)$ must therefore
be independent of $\varphi$. However, from (2.157) - (2.158), we find

$Y_\ell^m(u_1) + Y_\ell^m(u_2) \sim e^{im\varphi} + e^{im(\varphi + \pi/2)} = e^{im\varphi}(1 + e^{im\pi/2})$,

which is independent of $\varphi$ iff $m = 0$ or $m = 2$ (mod 4). For $\ell = 0, 2$ these are indeed
the only values that occur, but as soon as $\ell \geq 4$, the value $m = 4$ (among others) will
ruin it. So (2.150) holds only for $\ell = 0$ and $\ell = 2$, whereas (2.151) is the case for all
other $\ell \in \mathbb{N}$. Since $H_0$ and $H_2$ occur in $C(S^2)$ with multiplicity one, they cannot have
greater multiplicity in $\mathcal{F}(S^2) \subset C(S^2)$, so the above argument suggests that

$\mathcal{F}(S^2) = H_0 \oplus H_2$, (2.162)

which would prove the lemma. Fortunately, this is indeed the case, but to complete
the argument we need the following technical results (left out by Gleason himself):


Lemma 2.40. 1. Frame functions are uniformly continuous.

2. The representation (2.147) of $SO(3)$ on $\mathcal{F}(S^2)$ is continuous (in the usual sense
that the map $(R, f) \mapsto U(R)f$ from $SO(3) \times \mathcal{F}(S^2)$ to $\mathcal{F}(S^2)$ is continuous) with
respect to the supremum-norm on $\mathcal{F}(S^2)$.

3. A continuous representation of a compact group $G$ on a Banach space $B$ is completely
reducible (in that $B$ is the closure of the direct sum of all irreducible
representations of $G$ that it contains).


Proof. 1. The first claim follows because $S^2$ is compact. Another proof starts from
the proof of Lemma 2.38, which has the feature that for given $\varepsilon' > 0$, if $y, y' \in E$
with $y' = R_z(\varphi)y$ for some angle $\varphi$, then $U_{y'} = R_z(\varphi)U_y$ (this is immediately clear
from the geometry). Similarly, for $x \in S^2$, different neighborhoods $V = U_x$ are
related by a rotation. Hence the size of $U_x$ is independent of $x$, so that the above
proof of continuity establishes uniform continuity of frame functions also.

2. Let $R_n \to R$ in $SO(3)$ and $f_m \to f$ uniformly in $\mathcal{F}(S^2)$, i.e., $\|f_m - f\|_\infty \to 0$. Then,
subtracting and adding a term $U(R_n)f$ and using isometricity of $U$, i.e.,

$\|U(R_n)(f_m - f)\|_\infty = \|f_m - f\|_\infty$,

we obtain the estimate

$\|U(R_n)f_m - U(R)f\|_\infty \leq \|f_m - f\|_\infty + \|U(R_n)f - U(R)f\|_\infty$,

cf. (2.147). As $m \to \infty$ the first term on the right-hand side vanishes by assumption,
whilst the second vanishes as $n \to \infty$ by uniform continuity of $f$.

3. This is a Banach space version of the Peter-Weyl theorem, applied to the Banach
space of frame functions equipped with the supremum-norm (see Notes).


Something like this is necessary, because one needs to rule out the possibility that
although (by the Stone-Weierstrass Theorem) the polynomial functions on $\mathbb{R}^3$, restricted
to $S^2$, are uniformly dense in $C(S^2)$, so that the linear span of all spherical
harmonics and hence of all $H_\ell$ is uniformly dense in $C(S^2)$, some frame functions
might lie in the closure of this direct sum (or, in other words, they are given by uniformly
convergent infinite sums of certain $Y_\ell^m$). Lemma 2.40 clinches the proof of
(2.162), since the third part implies that $\mathcal{F}(S^2)$ would contain all irreducible representations
that contribute to the potential infinite sums; but we have already proved
that it only contains $H_0$ and $H_2$. Thus Lemma 2.35 now also follows.
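
The selection rule used above for even $\ell$, namely that $e^{im\varphi}(1 + e^{im\pi/2})$ is independent of $\varphi$ only for $m = 0$ or $m = 2$ (mod 4), is easy to probe numerically. The sketch below samples the expression at a few irregularly spaced angles; the function names, the sample angles, and the tolerance are hypothetical choices.

```python
import cmath
import math

def equator_sum(m, phi):
    """e^{i m phi} + e^{i m (phi + pi/2)}: the value of Y_l^m(u1) + Y_l^m(u2)
    on the equator, up to a phi-independent factor (m even)."""
    return cmath.exp(1j * m * phi) + cmath.exp(1j * m * (phi + math.pi / 2))

def is_phi_independent(m, samples=16, tol=1e-9):
    """Check numerically whether equator_sum(m, .) is constant in phi."""
    values = [equator_sum(m, 0.1 + 0.37 * k) for k in range(samples)]
    return all(abs(v - values[0]) < tol for v in values)
```

For $m = 0, \pm 2, \pm 6$ the check should succeed, whereas for $m = \pm 4$ or $m = 8$ it should fail, which is why the spaces $H_\ell$ with even $\ell \geq 4$ cannot consist of frame functions.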



2.9 Effects and Busch’s Theorem 


Gleason's Theorem is easy to state but difficult to prove; Busch's Theorem is a
variation of it, which is more difficult to state but much easier to prove. Logically,
Busch's Theorem is weaker than Gleason's, as the assumptions of the latter are contained
in those of the former, but physically it appears to be more useful, as it covers
more situations. To wit, Busch's Theorem revolves around certain generalizations
of projections (which took the centre stage in Gleason's Theorem) called effects:
these are (necessarily self-adjoint) operators $a \in B(H)$ that satisfy $0 \leq a \leq 1_H$, in
the sense defined after Proposition A.22. Thus $a \in B(H)$ is an effect iff

$0 \leq \langle \psi, a\psi\rangle \leq 1$ $(\psi \in H_1)$. (2.163)

The set of effects on a Hilbert space $H$ is denoted by $\mathcal{E}(H)$ or by $[0, 1]_{B(H)}$. By
Theorem A.10, we have (2.163) iff $a^* = a$ and the eigenvalues $\lambda$ of $a$ lie in the
interval $[0, 1]$ (i.e., $\sigma(a) \subset [0, 1]$). This implies that $\|a\| \leq 1$. Conversely,
using the bound $a \leq \|a\| \cdot 1_H$ for any self-adjoint operator $a$, which easily follows
from (A.47), we see that for $a \geq 0$, the condition $\|a\| \leq 1$ is equivalent to $a \in \mathcal{E}(H)$.
In particular, it follows that both projections and density operators are effects.


Proposition 2.41. 1. The set $\mathcal{E}(H)$ of effects on $H$ is a compact convex subset of
$B(H)$ in its $\sigma$-weak topology, with extreme boundary

$\partial_e \mathcal{E}(H) = \mathcal{P}(H)$, (2.164)

i.e., the set of all projections on $H$ (including 0).
2. Each $a \in \mathcal{E}(H)$ has a (typically non-unique) extremal decomposition

$a = \sum_{i=0}^{m} t_i f_i$, (2.165)

in which $t_i \geq 0$ and $\sum_i t_i = 1$, and the $f_i$ are projections.


The $\sigma$-weak topology on $B(H)$, defined after Corollary A.31, is the right one in this
context, but if $H$ is finite-dimensional, as we assume here, this technicality may be
ignored, as the claim is even true with respect to the norm topology.


Proof. In Part 1, compactness and convexity are easily checked.
The inclusion $\partial_e \mathcal{E}(H) \subseteq \mathcal{P}(H)$ is equivalent to the claim that any $a \in \mathcal{E}(H)$,
$a \notin \mathcal{P}(H)$, does not lie in $\partial_e \mathcal{E}(H)$ and hence admits a convex decomposition

$a = t a_1 + (1 - t) a_2$, $t \in (0, 1)$, $a_1, a_2 \in \mathcal{E}(H)$, $a_1 \neq a \neq a_2$, (2.166)

or, equivalently, $a$ has a nontrivial decomposition $a = \sum_i t_i a_i$, for certain $t_i \geq 0$ with
$\sum_i t_i = 1$. Indeed, the latter follows from the spectral resolution (A.37), in which the
spectral projections $e_\lambda$ should be rescaled if necessary so as to make the coefficients
sum to unity (note that $te \in \mathcal{E}(H)$ for any projection $e$ and any $t \in (0, 1)$).

To show the opposite inclusion $\mathcal{P}(H) \subseteq \partial_e \mathcal{E}(H)$, again assume (2.166), where
this time $a = e \in \mathcal{P}(H)$ is a projection. "Sandwiching" between $\psi \in H_1$, this yields

$\langle \psi, a_1\psi\rangle = \langle \psi, a_2\psi\rangle = 0$, $\psi \in (eH)^\perp$; (2.167)
$\langle \psi, a_1\psi\rangle = \langle \psi, a_2\psi\rangle = 1$, $\psi \in eH$. (2.168)

Using $0 \leq a_i \leq 1$, $i = 1, 2$, and (A.37), these equations imply that $a_1 = a_2 = e$.

The claim of part 2 is satisfied by picking the $t_i$ and $f_i$ in terms of the spectral
data associated to $a$ (cf. Theorem A.10), as follows: with $m = |\sigma(a)|$, order the
eigenvalues $\lambda \in \sigma(a)$ according to $\lambda_1 < \cdots < \lambda_m$, and take:

$t_0 = 1 - \lambda_m$; (2.169)
$t_1 = \lambda_1$; (2.170)
$t_i = \lambda_i - \lambda_{i-1}$ $(i \geq 2)$; (2.171)
$f_0 = 0$; (2.172)
$f_1 = 1_H$; (2.173)
$f_i = \sum_{j=i}^{m} e_{\lambda_j}$ $(i \geq 2)$. (2.174)

The validity of (2.165) is then a trivial verification.
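
The verification can be spelled out for an effect that is already diagonal, so that each spectral projection is a diagonal matrix of zeros and ones. The sketch below works in such an eigenbasis (an assumption made for simplicity), represents diagonal matrices as lists, and uses exact rational arithmetic; all names are hypothetical.

```python
from fractions import Fraction

def extremal_decomposition(diagonal):
    """Weights t_i and projections f_i (as 0/1 diagonals) for an effect that is
    diagonal with the given entries in [0, 1], following (2.169)-(2.174)."""
    n = len(diagonal)
    eigenvalues = sorted(set(diagonal))               # lambda_1 < ... < lambda_m
    weights = [1 - eigenvalues[-1], eigenvalues[0]]   # t_0 and t_1
    projections = [[0] * n, [1] * n]                  # f_0 = 0 and f_1 = 1_H
    # Python index k corresponds to the index i = k + 1 >= 2 of the text.
    for k in range(1, len(eigenvalues)):
        weights.append(eigenvalues[k] - eigenvalues[k - 1])
        # f_i projects onto the eigenspaces with eigenvalue >= lambda_i.
        projections.append([1 if d >= eigenvalues[k] else 0 for d in diagonal])
    return weights, projections

def recombine(weights, projections):
    """The diagonal of sum_i t_i f_i."""
    n = len(projections[0])
    return [sum(t * f[k] for t, f in zip(weights, projections)) for k in range(n)]
```

For instance, for the diagonal entries $(1/4, 3/4, 1/4, 1/2)$ the weights are non-negative, sum to one, and `recombine` returns the original diagonal, in agreement with (2.165).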


Note that, in general, the extremal decomposition of $a$ as an effect differs from its
spectral resolutions (A.37) or (A.38) as a self-adjoint operator. If $a = \rho$ is a density
operator, then the latter, i.e., (2.6), does provide an extremal decomposition of
$a$ construed as an effect also, which differs from the one in (2.165). This example
shows that extremal decompositions in $\mathcal{E}(H)$ are not necessarily unique. Also, observe
that $te$, for $e \in \mathcal{P}(H)$ and $t \in (0, 1)$, does not lie in $\partial_e \mathcal{E}(H)$, since it admits a
nontrivial decomposition $te = t \cdot e + (1 - t) \cdot 0$, recalling that $0 \in \mathcal{P}(H) \subset \mathcal{E}(H)$.
Busch's Theorem classifies the following objects.


Definition 2.42. A probability distribution on $\mathcal{E}(H)$ is a function $p : \mathcal{E}(H) \to [0, 1]$
that satisfies the following two conditions:

1. $p(1_H) = 1$;
2. If a (finite) family $(a_i)$ of effects satisfies $\sum_i a_i \leq 1_H$, then

$p\big(\sum_i a_i\big) = \sum_i p(a_i)$. (2.175)


Lemma 2.43. If a (finite) family $(a_i)$ of effects satisfies $\sum_i a_i = 1_H$, then $\sum_i p(a_i) = 1$.


This trivial observation implies that a probability distribution on $\mathcal{E}(H)$ induces a
probability distribution on $\mathcal{P}(H) \subset \mathcal{E}(H)$ by restriction, cf. Definition 2.23. Another
way to see this from the perspective of probability measures is to note that
any family $(e_i)$ of projections that satisfies $\sum_i e_i \leq 1_H$ is automatically orthogonal.
Therefore, restricted to $\mathcal{P}(H)$, Definition 2.42 reduces to Definition 2.23.2. To see
this, fix $j$ and pick $\psi \in e_j H$. The condition $\sum_i e_i \leq 1_H$ gives

    $\sum_{i \neq j} \langle \psi, e_i \psi \rangle = \sum_{i \neq j} \|e_i \psi\|^2 \leq 0$,

but since each term is positive, this implies $e_i \psi = 0$ for each $i \neq j$. Putting $\psi = e_j \varphi$,
where $\varphi \in H$ is arbitrary, this gives $e_i e_j \varphi = 0$ for all $\varphi$ and hence $e_i e_j = 0$.
Clearly, any state $\omega$ on $B(H)$ induces a probability distribution $p_\omega$ on $\mathcal{E}(H)$ by

    $p_\omega(a) = \omega(a)$.   (2.176)

Busch’s Theorem shows the converse.


Theorem 2.44. Any probability distribution $p$ on $\mathcal{E}(H)$ takes the form $p = p_\omega$ for
some state $\omega$ on $B(H)$, establishing a bijective correspondence between probability
distributions on $\mathcal{E}(H)$ and states on $B(H)$.


Proof. If $p : \mathcal{E}(H) \to [0,1]$ can be extended to a linear map $\omega : B(H) \to \mathbb{C}$, then
$\omega$ is automatically a state, for normalization is assumed and positivity follows from
the fact that any $0 \neq b \geq 0$ has the form $b = ra$ for some $r \in \mathbb{R}^+$ and $0 \leq a \leq 1_H$,
namely with $r = \|b\|$ and $a = b/\|b\|$; then $a \geq 0$ and $\|a\| = 1$, so that, as explained
earlier, $a$ is an effect. Hence $\omega(b) = \omega(ra) = r p(a) \geq 0$. To achieve this extension:


1. We show that $p(ra) = r p(a)$ for all $r \in \mathbb{Q} \cap [0,1]$ and $0 \leq a \leq 1_H$. Indeed, for
any such $a$ and $n \in \mathbb{N}$ we write $a = (a + \cdots + a)/n$ ($n$ terms), so that by (2.175),
$p(a) = n\, p(a/n)$. Similarly, for any $m \in \mathbb{N}$ and $0 \leq b \leq 1_H/m$, we have $p(mb) = m\, p(b)$.
Take integers $m, n$ such that $(m/n) \in [0,1]$ and put $b = a/n$, so that

    $p\left(\frac{m}{n}\, a\right) = m\, p\left(\frac{a}{n}\right) = \frac{m}{n}\, p(a)$.   (2.177)

2. We next prove that $p(ta) = t p(a)$ for all $t \in [0,1]$ and $0 \leq a \leq 1_H$. Positivity of $p$
yields $p(a) \leq p(a')$ whenever $0 \leq a \leq a' \leq 1_H$. Given $t \in [0,1]$, take an increasing
sequence of rationals $(r_n)$ with $r_n \leq t$, as well as a decreasing sequence of
rationals $(s_n)$ with $t \leq s_n$, such that $r_n \uparrow t$ and $s_n \downarrow t$ in $\mathbb{R}$. With step 1, this gives

    $r_n p(a) = p(r_n a) \leq p(ta) \leq p(s_n a) \leq s_n p(a)$.

Letting $n \to \infty$, this gives $t p(a) \leq p(ta) \leq t p(a)$, and hence equality.

3. Now extend $p$ to all $a \geq 0$, calling the extension $\omega$, by $\omega(a) = \|a\|\, p(a/\|a\|)$
for $a \neq 0$ and $\omega(0) = 0$; the previous step then easily yields the compatibility
property $\omega|_{[0,1]_{B(H)}} = p$ and the scaling property $\omega(ta) = t\omega(a)$ for each $t \geq 0$.

4. For $a \geq 0$ and $b \geq 0$, rescaling and (2.175) yield $\omega(a+b) = \omega(a) + \omega(b)$.

5. For general $a^* = a$ we write $a = a_+ - a_-$, with $a_\pm \geq 0$, as in Proposition A.24,
and define $\omega$ on all of $B(H)_{sa}$ by $\omega(a) = \omega(a_+) - \omega(a_-)$. This is well defined
despite the lack of uniqueness of (A.74), for if $a = a_+ - a_- = a'_+ - a'_-$, with
$a'_\pm \geq 0$, then $a_+ + a'_- = a'_+ + a_-$, whence $\omega(a_+) - \omega(a_-) = \omega(a'_+) - \omega(a'_-)$.
This argument also shows that $\omega$ remains linear on general self-adjoint $a$ and $b$,
since $a + b = (a_+ + b_+) - (a_- + b_-)$ is a decomposition with $(a_\pm + b_\pm) \geq 0$.

6. Finally, for general $c \in B(H)$ we (uniquely) decompose $c = a + ib$, $a^* = a$, $b^* = b$,
cf. the proof of Corollary A.20, and put $\omega(c) = \omega(a) + i\omega(b)$.


To close, we give a very brief and superficial introduction to effects as they arise 
from modern (“operational”) quantum measurement theory. This theory associates 
quantum data to classical data through the concept of a Positive Operator Valued 
Measure or POVM. Relative to some given “classical” space X (taken finite here) 
and Hilbert space H (assumed finite-dimensional), a POVM is defined as a map 


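    $A : \mathcal{P}(X) \to \mathcal{E}(H)$   (2.178)


that satisfies $A(X) = 1_H$ as well as $A(U \cup V) = A(U) + A(V)$ whenever $U \cap V = \emptyset$,
cf. Definition 1.1. Equivalently, a POVM is a map

    $a : X \to \mathcal{E}(H)$   (2.179)

that satisfies

    $\sum_{x \in X} a(x) = 1_H$.   (2.180)

The two descriptions are related by

    $a(x) = A(\{x\})$;   (2.181)
    $A(U) = \sum_{x \in U} a(x)$.   (2.182)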


The motivating special case of a POVM is given by some self-adjoint operator
$a \in B(H)$, which yields $X = \sigma(a)$ and $a(\lambda) = e_\lambda$. In that case, each density operator
$\rho$ induces a probability distribution on $\sigma(a)$ through the Born rule (2.8). More generally,
a probability distribution $p$ on $\mathcal{E}(H)$ and a POVM (2.179) jointly determine a
probability distribution $p_a$ on $X$, given by

    $p_a(x) = p(a(x))$.   (2.183)

Indeed, $p_a(x) \geq 0$ because $a \geq 0$, and $\sum_{x \in X} p_a(x) = 1$ by (2.180) and Lemma 2.43.
The idea, then, is that a measurement of some POVM $a$ has (classical) outcome $x$
with probability $p_a(x)$; this generalizes the traditional dogma that a measurement
of an observable $a$ has outcome $\lambda \in \sigma(a)$ with (Born) probability (2.8). Indeed,
combined with (2.33), Busch’s Theorem shows that we necessarily have

    $p_a(x) = \mathrm{Tr}(\rho\, a(x))$,   (2.184)

for some density operator $\rho$. So nothing has been gained by introducing Definition
2.42, except perhaps for the insight that, as in Gleason’s Theorem, it is the
non-contextuality of a probability distribution on $\mathcal{E}(H)$, in that $p(a(x))$ is independent
of the POVM $a$ which $a(x)$ forms part of, that eventually enforces (2.184).
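
As a small illustration of (2.180), (2.182) and (2.184), the following sketch builds a two-outcome POVM on $H = \mathbb{C}^2$ and computes the outcome probabilities $\mathrm{Tr}(\rho\, a(x))$. The density matrix, the two effects and the outcome labels are hypothetical choices made for this example only; the effects are deliberately not projections.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

identity = [[F(1), F(0)], [F(0), F(1)]]

# Density matrix: positive with unit trace (eigenvalues are both positive).
rho = [[F(2, 3), F(1, 3)], [F(1, 3), F(1, 3)]]

# Two effects with eigenvalues 1/4 and 3/4 each; they sum to the identity.
povm = {
    "x0": [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 2)]],
    "x1": [[F(1, 2), F(-1, 4)], [F(-1, 4), F(1, 2)]],
}

# A(X) = sum of a(x) over all outcomes x, cf. (2.182) with U = X.
total = add(povm["x0"], povm["x1"])

# p_a(x) = Tr(rho a(x)), cf. (2.184).
probabilities = {x: trace(matmul(rho, effect)) for x, effect in povm.items()}
print(total == identity, probabilities)
```

The normalization of the probabilities is not imposed by hand: it follows from the effects summing to $1_H$ together with $\mathrm{Tr}(\rho) = 1$, mirroring the role of Lemma 2.43.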
2.10 The quantum logic of Birkhoff and von Neumann


In §1.4 we showed that classical mechanics has a classical logical structure, in
which (equivalence classes of) propositions correspond to subsets of phase space.
These subsets form a Boolean lattice in which the logical connectives $\neg$, $\wedge$, and $\vee$
for negation, conjunction, and disjunction, respectively, are interpreted as their
natural set-theoretic counterparts (i.e., complementation, intersection, and union).

In 1936, Birkhoff and von Neumann proposed a strikingly similar quantum logic
for quantum mechanics, in which (closed) linear subspaces of Hilbert space play
the role of (measurable) subsets of phase space, and the basic logical connectives
(except implication, which is queerly lacking in this setting) are interpreted as:

    $\neg L = L^\perp$;   (2.185)
    $L \wedge M = L \cap M$;   (2.186)
    $L \vee M = L + M$,   (2.187)

where $L^\perp$ is the orthogonal complement of $L$, see (A.29), $L \cap M$ is the (set-theoretic)
intersection of $L$ and $M$, and $L + M$ is the (closed) linear span of $L$ and $M$. If
$\dim(H) < \infty$, as we continue to assume, any linear subspace of $H$ is automatically
closed; in the infinite-dimensional case an attractive operator-algebraic and
lattice-theoretic structure arises only if the events are taken to be closed linear subspaces.
Although the Brouwer–Hilbert debate on the foundations of mathematics had
somewhat subsided in 1936, with hindsight it may be argued that the quantum
logic of Birkhoff and von Neumann (who had been a “postdoc” avant la lettre with
Hilbert) was predicated on their desire to preserve not only the law of contradiction

    $\alpha \wedge \neg\alpha = \bot$,   (2.188)

where $\alpha$ is any proposition and $\bot$ is the proposition that is identically false, but also,
against Brouwer, the law of excluded middle (or tertium non datur)

    $\alpha \vee \neg\alpha = \top$,   (2.189)

where $\top$ is the proposition that is identically true. Indeed, in the Birkhoff–von Neumann
model (2.185) - (2.187), where $\bot = \{0\}$ and $\top = H$, these are identities.
Similarly, their model satisfies the law of double negation

    $\neg\neg\alpha = \alpha$,   (2.190)

which both in classical logic (where it is a tautology) and in intuitionistic logic
(where it is rejected in general) is equivalent to (2.189). Also, De Morgan’s Laws:

    $\neg(\alpha \vee \beta) = \neg\alpha \wedge \neg\beta$;   (2.191)
    $\neg(\alpha \wedge \beta) = \neg\alpha \vee \neg\beta$,   (2.192)


hold in their quantum logic (despite their origin in classical propositional logic). 
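
To get a feel for the model (2.185) - (2.187), the sketch below implements the three connectives for subspaces of $\mathbb{Q}^2$ with the standard inner product (a stand-in for a two-dimensional real Hilbert space, chosen so that all arithmetic is exact). Subspaces are stored as lists of spanning vectors; the helper names and the three sample lines $L$, $M$, $N$ are assumptions of this example. The meet is computed through the finite-dimensional identity $L \cap M = (L^\perp + M^\perp)^\perp$, so De Morgan’s Laws are built in rather than tested.

```python
from fractions import Fraction

DIM = 2  # we work in Q^2 with the standard inner product

def rref(vectors):
    """Row-reduce `vectors`; return (canonical basis rows, pivot columns)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivots, r = [], 0
    for c in range(DIM):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows[:r], pivots

def join(L, M):
    """L v M: the linear span of L and M, cf. (2.187)."""
    return rref(L + M)[0]

def complement(L):
    """not L: the orthogonal complement, computed as a null space, cf. (2.185)."""
    rows, pivots = rref(L)
    result = []
    for free in (c for c in range(DIM) if c not in pivots):
        v = [Fraction(0)] * DIM
        v[free] = Fraction(1)
        for row, p in zip(rows, pivots):
            v[p] = -row[free]
        result.append(v)
    return rref(result)[0]

def meet(L, M):
    """L ^ M: the intersection, via (L^perp + M^perp)^perp, cf. (2.186)."""
    return complement(join(complement(L), complement(M)))

def contains(L, psi):
    """True iff the vector psi lies in the subspace L."""
    return len(join(L, [psi])) == len(rref(L)[0])

L, M, N = [[1, 0]], [[0, 1]], [[1, 1]]
WHOLE, ZERO = [[1, 0], [0, 1]], []

excluded_middle = join(L, complement(L)) == WHOLE   # (2.189)
contradiction = meet(L, complement(L)) == ZERO      # (2.188)
lhs = meet(L, join(M, N))                           # L ^ (M v N)
rhs = join(meet(L, M), meet(L, N))                  # (L ^ M) v (L ^ N)
psi = [1, 1]
print(excluded_middle, contradiction, lhs == rhs)
```

In this example the laws (2.188) and (2.189) hold, whereas $L \wedge (M \vee N)$ is the line $L$ while $(L \wedge M) \vee (L \wedge N)$ is the zero subspace, so the connectives do not distribute. Likewise `psi` lies in $L \vee M$ without lying in $L$ or in $M$.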
We will now derive the Birkhoff–von Neumann structure along similar lines as its
classical counterpart (cf. §1.4), except that in the absence of the necessary structure
for a classical propositional calculus we now rely on semantic entailment alone.

In quantum theory, the role of functions $f : X \to \mathbb{R}$ as observables in classical
physics is played by self-adjoint operators $a : H \to H$ on some Hilbert space $H$, and
hence the quantum analogue of an elementary proposition $f \in \Delta$ of classical physics
is $a \in \Delta$ (where $\Delta \subset \mathbb{R}$), with special case $a = \lambda$ for $a \in \{\lambda\}$ (with $\lambda \in \mathbb{R}$).

In analogy to the points $x \in X$ of phase space, pure states $\omega_\psi$ as in (2.42), or
the corresponding density operators $e_\psi$ (where $\psi \in H$ is a unit vector), yield truth
assignments to elementary propositions. To start with the simplest case, $a = \lambda$ is:


• true with respect to $\omega_\psi$ iff $p_a^\psi(\lambda) = 1$, see (2.10), or, equivalently, iff $\psi \in H_\lambda$,
  where $H_\lambda \subset H$ is the eigenspace of $a$ for eigenvalue $\lambda$, cf. (A.36);
• false with respect to $\omega_\psi$ iff $p_a^\psi(\lambda) = 0$, or, equivalently, iff $\psi \perp H_\lambda$.


The underlying idea here is arguably that, according to some naive operational
interpretation of quantum mechanics, a measurement of $a$ in a state $\omega_\psi$ would give
outcome $\lambda$ with probability one (zero) iff $a = \lambda$ is true (false) with respect to $\omega_\psi$. If
$0 < p_a^\psi(\lambda) < 1$, the “truthmaker” $\omega_\psi$ actually fails to assign a truth value to $a = \lambda$;
the partial nature of truthmakers marks a significant difference with the classical
case, as does the closely related distinction between false and not true. Similarly,
we say that an elementary proposition $a \in \Delta$ is true in some state $\omega_\psi$ iff

    $P_a^\psi(\Delta) = \|e_\Delta \psi\|^2 = 1$,   (2.193)

cf. (2.9) and (A.42), and false if $P_a^\psi(\Delta) = 0$. In other words, $a \in \Delta$ is true in $\omega_\psi$
iff $\psi \in H_\Delta$, and false if $\psi \perp H_\Delta$, see (A.43). Such propositions may formally be
combined using the connectives $\neg$, $\wedge$, and $\vee$ (whose meaning is unfortunately far
from clear in this new setting) according to the same (inductive) formation rules as
in classical propositional logic. However, the classical truth tables for $\wedge$ and $\vee$ are
unsound with regard to the above rules, at least if one eventually wants to arrive at
(2.185) - (2.187). For example, $\omega_\psi$ may validate neither $\alpha$ nor $\beta$, yet it might make
$\alpha \vee \beta$ true (assuming that $\alpha$ and $\beta$ correspond to $L$ and $M$, respectively, this is the
case if $\psi \notin L$ and $\psi \notin M$, yet $\psi \in L + M$). Similarly, $\omega_\psi$ may render neither $\alpha$ nor $\beta$
false, yet it may falsify $\alpha \wedge \beta$. Due to this complication, the approach of §1.4 has to
be modified, as follows. Our goal remains to define a semantic equivalence relation
$\sim_H$, which is predicated on an inductive definition of truth we first give.


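Definition 2.45. 1. $a \in \Delta$ is true in $\omega_\psi$ iff $P_a^\psi(\Delta) = 1$, and false if $P_a^\psi(\Delta) = 0$.

2. The negation $\neg(a \in \Delta)$ of an elementary proposition $a \in \Delta$ is given by $a \in \Delta^c$.

3. The negation $\neg\alpha$ is true iff $\alpha$ is false.

4. The conjunction $\alpha \wedge \beta$ is true iff both $\alpha$ and $\beta$ are true.

5. De Morgan’s Laws (2.191) - (2.192) and the law of double negation (2.190) hold;
in particular, the disjunction $\alpha \vee \beta$ is true iff $\neg(\neg\alpha \wedge \neg\beta)$ is true (as per 1-4).

6. We write $\alpha \models_H \beta$ iff the truth of $\alpha$ implies the truth of $\beta$, for each state $\omega_\psi$.

7. We write $\alpha \sim_H \beta$ iff $\alpha \models_H \beta$ and $\beta \models_H \alpha$.

8. If $\alpha \sim_H \beta$, then $\neg\alpha \sim_H \neg\beta$.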
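
The first two items of Definition 2.45 can be tried out in a minimal sketch, assuming that $a$ is diagonal in the standard basis, so that $\|e_\Delta\psi\|^2$ is the sum of $|\psi_i|^2$ over the indices whose eigenvalue lies in $\Delta$. The function names, the observable and the state are hypothetical; `None` stands for “no truth value assigned”.

```python
from fractions import Fraction

def born_probability(eigenvalues, psi, delta):
    """||e_Delta psi||^2 for a = diag(eigenvalues) and a unit vector psi."""
    return sum(abs(c) ** 2 for lam, c in zip(eigenvalues, psi) if lam in delta)

def truth_value(eigenvalues, psi, delta):
    """True / False if the state decides 'a in Delta', None otherwise."""
    p = born_probability(eigenvalues, psi, delta)
    if p == 1:
        return True
    if p == 0:
        return False
    return None

eigenvalues = [1, 2, 3]                               # spectrum of a
psi = [Fraction(3, 5), Fraction(4, 5), Fraction(0)]   # unit vector

decided_true = truth_value(eigenvalues, psi, {1, 2})   # psi lies in H_{1,2}
decided_false = truth_value(eigenvalues, psi, {3})     # psi is orthogonal to H_{3}
undecided = truth_value(eigenvalues, psi, {1})         # probability 9/25
negation_undecided = truth_value(eigenvalues, psi, {2, 3})  # a in {1}^c
print(decided_true, decided_false, undecided, negation_undecided)
```

The last two lines show the partial nature of the truth assignment: in this state neither $a \in \{1\}$ nor its negation $a \in \{1\}^c$ receives a truth value, since $\sigma(a) \cap \{1\}^c = \{2, 3\}$ carries probability $16/25$.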
Lemma 2.46. Definition 2.45 implies the following rules:


1. Our earlier truth attributions for the case $a \in \Delta$ with $\Delta = \{\lambda\}$. In particular,
$a = \lambda$ is always false when $\lambda \notin \sigma(a)$, and so is $a \in \Delta$ whenever $\Delta \cap \sigma(a) = \emptyset$.
2. $a \in \Delta$ is false relative to $\omega_\psi$ iff $\psi \perp H_\Delta$.
3. $(a \in \Delta) \wedge (b \in \Gamma)$ is true in $\omega_\psi$ iff $\psi \in H_\Delta^{(a)} \cap H_\Gamma^{(b)}$.
4. $(a \in \Delta) \vee (b \in \Gamma)$ is true in $\omega_\psi$ iff $\psi \in H_\Delta^{(a)} + H_\Gamma^{(b)}$.


Hence conjunctions behave classically, as part 3 states that $(a \in \Delta) \wedge (b \in \Gamma)$ is true
iff $a \in \Delta$ and $b \in \Gamma$ are true. The proof of this lemma uses the following notation.


Definition 2.47. If $e$ and $f$ are projections on a Hilbert space $H$, then:


• $e \wedge f$ is the projection onto $eH \cap fH$;
• $e \vee f$ is the projection onto $eH + fH$, i.e., the (closed) linear span of $eH$ and $fH$.


Note that if $e$ and $f$ commute, these reduce to the algebraic expressions

    $e \wedge f = ef$;   (2.194)
    $e \vee f = e + f - ef$.   (2.195)

Furthermore, in case of potential ambiguity we will write $e_\lambda^{(a)}$ for the spectral
projection $e_\lambda$ as defined by $a$, and analogously $e_\Delta^{(a)}$, etc. Similarly for $H_\lambda^{(a)}$ etc.

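Proof. The first and third claims are immediate. The second one follows from the
relation $e_{\Delta^c} = e_\Delta^\perp = 1 - e_\Delta$, or, equivalently, $H_{\Delta^c} = H_\Delta^\perp$. For the fourth, use
Definition 2.45.6, 3, and 2 to infer that $(a \in \Delta) \vee (b \in \Gamma)$ is true iff $(a \in \Delta^c) \wedge (b \in \Gamma^c)$ is
false. From the third claim, we note that

    $(a \in \Delta) \wedge (b \in \Gamma) \sim_H (e_\Delta^{(a)} \wedge e_\Gamma^{(b)} = 1)$,   (2.196)

so by Definition 2.45.5, $(a \in \Delta^c) \wedge (b \in \Gamma^c)$ is false iff $e_{\Delta^c}^{(a)} \wedge e_{\Gamma^c}^{(b)} = 1$ is false. Since
$e_{\Delta^c}^{(a)} \wedge e_{\Gamma^c}^{(b)} = 1$ is true iff $\psi \in H_{\Delta^c}^{(a)} \cap H_{\Gamma^c}^{(b)}$, claim 2 implies $e_{\Delta^c}^{(a)} \wedge e_{\Gamma^c}^{(b)} = 1$ is false iff

    $\psi \in (H_{\Delta^c}^{(a)} \cap H_{\Gamma^c}^{(b)})^\perp = ((H_\Delta^{(a)})^\perp \cap (H_\Gamma^{(b)})^\perp)^\perp = (H_\Delta^{(a)})^{\perp\perp} + (H_\Gamma^{(b)})^{\perp\perp} = H_\Delta^{(a)} + H_\Gamma^{(b)}$,

which finishes the proof.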


Quite analogously to the classical case, Definition 2.45 implies

    $(a \in \Delta) \models_H (b \in \Gamma)$ iff $e_\Delta^{(a)} \leq e_\Gamma^{(b)}$,   (2.197)

which, once again, immediately yields $(a \in \Delta) \sim_H (b \in \Gamma)$ iff $e_\Delta^{(a)} = e_\Gamma^{(b)}$. Taking
$b = e_\Delta^{(a)}$ and $\Gamma = \{1\}$, analogously to (1.53), as in the above proof we have

    $a \in \Delta \sim_H e_\Delta^{(a)} = 1$.   (2.198)

Furthermore, as in the proof of Lemma 2.46 we find

    $(a \in \Delta) \wedge (b \in \Gamma) \sim_H (e_\Delta^{(a)} \wedge e_\Gamma^{(b)} = 1)$;   (2.199)
    $(a \in \Delta) \vee (b \in \Gamma) \sim_H (e_\Delta^{(a)} \vee e_\Gamma^{(b)} = 1)$.   (2.200)


Consequently, we have the following counterpart of Lemma 1.19: 


Lemma 2.48. Any elementary or composite proposition is semantically equivalent
(relative to $H$) to one of the form $e = 1$, for some projection $e$. Furthermore,

    $\neg(e = 1) \sim_H (e^\perp = 1)$;   (2.201)
    $(e = 1) \wedge (f = 1) \sim_H (e \wedge f = 1)$;   (2.202)
    $(e = 1) \vee (f = 1) \sim_H (e \vee f = 1)$.   (2.203)

At last, the quantum version of Theorem 1.20 reads as follows: 


Theorem 2.49. The set $Q(H)$ of equivalence classes $[\cdot]_H$ of propositions generated
by the elementary propositions $a \in \Delta$ and the logical connectives $\neg$, $\vee$, and $\wedge$, is
isomorphic to the set $\mathcal{L}(H)$ of linear subspaces of $H$, under the map

    $\varphi : Q(H) \to \mathcal{L}(H)$;   (2.204)
    $\varphi([a \in \Delta]_H) = e_\Delta^{(a)} H$.   (2.205)

Under this isomorphism, the logical connectives $\neg$, $\wedge$ and $\vee$ turn into orthogonal
complementation $(\cdot)^\perp$, intersection $\cap$, and linear span $+$, respectively, in that

    $\varphi([\neg\alpha]_H) = \varphi([\alpha]_H)^\perp$;   (2.206)
    $\varphi([\alpha \wedge \beta]_H) = \varphi([\alpha]_H) \cap \varphi([\beta]_H)$;   (2.207)
    $\varphi([\alpha \vee \beta]_H) = \varphi([\alpha]_H) + \varphi([\beta]_H)$.   (2.208)

Furthermore, if we define a partial order $\leq$ on $Q(H)$ by saying that $[\alpha]_H \leq [\beta]_H$ iff
$\alpha \models_H \beta$ (which is well defined), then $\varphi$ maps $\leq$ into set-theoretic inclusion $\subseteq$, i.e.,

    $[\alpha]_H \leq [\beta]_H$ iff $\varphi([\alpha]_H) \subseteq \varphi([\beta]_H)$.   (2.209)

With respect to these operations, $\mathcal{L}(H)$ is a modular lattice (granted that $\dim(H) < \infty$;
otherwise, the lattice is merely orthomodular, cf. §D.1 for terminology).


Proof. Most of this is immediate from Lemma 2.48, except for the last claim, which
follows from simple computations (and from the Amemiya–Araki Theorem).


As in the classical case, there is an algebraic reformulation of this result, obtained 
from the bijective correspondence between (closed) linear subspaces L of H and 
projections e on H, given by L = eH (see Proposition A.8). 
Theorem 2.50. The set $Q(H)$ of equivalence classes $[\cdot]_H$ of propositions generated
by the elementary propositions $a \in \Delta$ and the logical connectives $\neg$, $\vee$, and $\wedge$, is
isomorphic to the set $\mathcal{P}(H)$ of projections on $H$, under the map

    $\varphi' : Q(H) \to \mathcal{P}(H)$;   (2.210)
    $\varphi'([a \in \Delta]_H) = e_\Delta^{(a)}$,   (2.211)

where (once again) $\mathcal{P}(H)$ is the set of all projections on $H$.
Under this map, the logical connectives $\neg$, $\wedge$ and $\vee$ turn into (cf. Definition 2.47):

    $\varphi'([\neg\alpha]_H) = 1 - \varphi'([\alpha]_H)$;   (2.212)
    $\varphi'([\alpha \wedge \beta]_H) = \varphi'([\alpha]_H) \wedge \varphi'([\beta]_H)$;   (2.213)
    $\varphi'([\alpha \vee \beta]_H) = \varphi'([\alpha]_H) \vee \varphi'([\beta]_H)$.   (2.214)

Furthermore, $\varphi'$ maps the partial order $\leq$ on $Q(H)$ into the partial order on $\mathcal{P}(H)$
defined by $e \leq f$ iff $eH \subseteq fH$, or equivalently, iff $ef = e$.
Finally, with respect to these operations, $\mathcal{P}(H)$ is an (ortho)modular lattice.


However, unlike (1.65) - (1.68), this result is somewhat unsatisfactory in not being
purely algebraic. This may partly be remedied through expressions like

    $e \wedge f = \lim_{n \to \infty} (e \circ f)^n$;   (2.215)
    $e \vee f = 1 - ((1-e) \wedge (1-f))$,   (2.216)


where $e \circ f = \frac{1}{2}(ef + fe)$, and the (strong) limit in (2.215) should be taken on fixed
vectors $\psi \in H$ (upon which it exists in the norm-topology of $H$). Even so, this specific
limit still relies on the underlying Hilbert space, and in any case the expressions fail 
to be purely algebraic and look pretty artificial. Indeed, the same may be said about 
Definition 2.45, which, of course, has been fine-tuned with hindsight in order to ob- 
tain the “desired” answer in the form of Theorem 1.20, which in turn vindicates the 
mathematically sweet Birkhoff—von Neumann Ansatz (2.185) - (2.187). 
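
The limit formula (2.215) can be checked numerically in a small case. The sketch below assumes the Jordan product is normalized as $e \circ f = \frac{1}{2}(ef + fe)$ (with this normalization the powers stay bounded, since $e \circ e = e$), and uses two hypothetical non-commuting projections on $\mathbb{R}^3$ whose ranges intersect in the line spanned by the first basis vector.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# e projects onto span{e1, e2}; f projects onto span{e1, e2 + e3}.
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
f = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]

ef, fe = matmul(e, f), matmul(f, e)
commute = ef == fe

# Jordan product with the factor 1/2 (an assumption of this sketch).
jordan = [[0.5 * (ef[i][j] + fe[i][j]) for j in range(3)] for i in range(3)]

power = jordan
for _ in range(59):            # after the loop, power = (e o f)^60
    power = matmul(power, jordan)

# The projection onto eH ∩ fH = span{e1}.
expected_meet = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
error = max(abs(power[i][j] - expected_meet[i][j])
            for i in range(3) for j in range(3))
print(commute, error)
```

Here $ef \neq fe$, so (2.194) does not apply, yet the powers of the Jordan product converge to the projection onto the intersection, because all eigenvalues of $e \circ f$ other than 1 have modulus strictly less than one.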

In addition, there are serious conceptual objections to this kind of quantum logic: 


1. Conjunction $\wedge$ and disjunction $\vee$ do not distribute over each other, rendering their
interpretation as “and” and “or” obscure.

2. There are propositions $\alpha$ and $\beta$ (namely those for which $\varphi'([\alpha]_H)$ and $\varphi'([\beta]_H)$
do not commute) for which the conjunction $\alpha \wedge \beta$ is physically undefined.

3. There are states in which $\alpha \vee \beta$ is true whilst neither $\alpha$ nor $\beta$ is true.

4. There are states in which $\alpha \wedge \beta$ is false whilst neither $\alpha$ nor $\beta$ is false.

5. In view of Schrödinger’s Cat, one would expect the law of excluded middle
(2.189) to fail in quantum mechanics, yet it holds in quantum logic (and this
is possible because neither $\vee$ nor $\neg$ has any familiar logical meaning in it).

6. Finally, nothing is said or done about propositions that are neither true nor false.


In Chapter 12, we will therefore replace the doomed quantum logic of Birkhoff and 
von Neumann by the intuitionistic logic of Brouwer and Heyting. 
Notes


All operator theory for this chapter may be found in Kadison & Ringrose (1983). 


§2.1. Quantum probability theory and the Born rule 

The Born rule was first stated by Born (1926b) in the context of scattering the- 
ory, following the earlier paper (Born, 1926a) in which Born omitted the absolute 
value squared signs (corrected in a footnote added in proof). The application to the 
position operator is due to Pauli (1927), who merely spent a footnote on it. The general
formulation is due to von Neumann (1932, §III), following earlier contributions
by Dirac (1926b) and Jordan (1927). Both Born and Heisenberg acknowledge the
profound influence of Einstein on the probabilistic formulation of quantum mechan- 
ics. However, Born and Heisenberg as well as Bohr, Dirac, Jordan, Pauli and von 
Neumann differed with Einstein about the fundamental nature of the Born probabil- 
ities and hence on the issue of determinism. Indeed, whereas Born and the others 
just listed after him believed the outcome of any individual quantum measurement 
to be unpredictable in principle, Einstein felt this unpredictability was just caused 
by the incompleteness of quantum mechanics (as he saw it). See, for example, the 
invaluable correspondence between Einstein and Born (2005). 

Mehra & Rechenberg (2000) provide a very detailed reconstruction of the histor- 
ical origin of the Born rule within the context of quantum mechanics, whereas von 
Plato (1994) embeds a briefer historical treatment of it into the more general setting 
of the emergence of modern probability theory and probabilistic thinking. For the 
earlier history of probability see Hacking (1975, 1990). See also Landsman (2009). 


§2.2. Quantum observables and states 
Proposition 2.10 is due to von Neumann; see also Chapter 6. 


§2.3. Pure states in quantum mechanics 

This kind of thinking goes back to von Neumann (1932) and Segal (1947ab). 
§2.4. The GNS-construction for matrices 

Again, see §C.12 for the GNS-construction in general. 
§2.5. The Born rule from Bohrification 

See notes to §4.1. 
§2.6. The Kadison-Singer Problem 

The Kadison—Singer Problem was first discussed in Kadison & Singer (1959). 
See the Notes to §4.3 for more information. 


§2.7. Gleason’s Theorem 
§2.8. Proof of Gleason’s Theorem 

Gleason’s Theorem is due to Gleason (1957), whose proof we largely follow, 
with some simplifications due to Varadarajan (1985) and Hamhalter (2004). Lemma 
2.40.3 or some analogous result is lacking from these references; it may be found 
in Lyubich (1988), Chapter 4, §2, Theorem. It is often claimed that Gleason’s proof 
has been superseded by the more elementary one due to Cooke, Keane, & Moran 
(1985), which avoids all use of harmonic analysis. A similar proof, following up on 
Cooke et al but using constructive analysis only, was given by Richman & Bridges 
(1999). However, both because Gleason’s use of rotation invariance is very natural, 
and also since the proof of Cooke et al has already been presented and simplified in
two monographs entirely devoted to Gleason’s Theorem, viz. Dvurečenskij (1993)
and Hamhalter (2004), as well as in the highly efficient book by Kalmbach (1998),
we prefer to return to the original source (and add some technical details).


§2.9. Effects and Busch’s Theorem 

Busch’s Theorem is from Busch (2003), whose proof we follow almost verbatim. 
See also Caves et al (2004). For the use of POVM’s in quantum physics see, e.g., 
Busch, Grabowski, & Lahti (1998), Davies (1976), Holevo (1982), Kraus (1983), 
Landsman (1998a, 1999), de Muynck (2002), and Schroeck (1996). 
§2.10. The quantum logic of Birkhoff and von Neumann

Our discussion is based on Rédei (1998), with some modifications though. The original
source is Birkhoff & von Neumann (1936).


Chapter 3 
Classical physics on a general phase space 


Passing from finite phase spaces X to infinite ones yields many fascinating new phe- 
nomena, some of which even seem genuinely “emergent” in not having any finite- 
dimensional shadow, approximate or otherwise. Nonetheless, practically all results 
in the previous chapter remain valid, typically after the inclusion of some technical 
condition(s) that restrict the almost unlimited freedom allowed by infinite sets. 

One of these restrictions is that in classical physics we assume that our phase 
space X is locally compact Hausdorff, where we recall that a space is: 


• compact if every open cover has a finite subcover;

• locally compact if every point has a compact neighbourhood;

• Hausdorff (or $T_2$) if every pair of distinct points $x, y$ can be separated by open
  sets (i.e., there are disjoint open sets $U_x$, $U_y$ that contain $x$ and $y$, respectively).


This combination of topological properties turns out to be very convenient; it
incorporates spaces like $\mathbb{R}^n$ (and more generally all non-pathological manifolds), or
lattices like $\mathbb{Z}^n$ (the price is that we exclude systems with an infinite number of
degrees of freedom, such as classical field theories). A locally compact Hausdorff
space $X$ is regular in that each $x \in X$ and each closed set $F \subset X$ not containing $x$
can be separated by open sets (i.e., there are disjoint open sets $U_x \ni x$ and $U_F \supset F$).

From the perspective of C*-algebras, the main advantage of using this particular 
class of spaces is that they are naturally singled out by Gelfand’s Theorem: 


Theorem 3.1. Every commutative C*-algebra $A$ is isomorphic to $C_0(X)$ for some
locally compact Hausdorff space $X$, which is unique up to homeomorphism.


A proof may be found in Appendix C; here we just explain the notation and the
main idea behind the proof (cf. Definition C.1, which we do not repeat).

First, $C_0(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity,
i.e., for any $\varepsilon > 0$ the set $\{x \in X \mid |f(x)| \geq \varepsilon\}$ is compact, or, equivalently, for any
$\varepsilon > 0$ there is a compact set $K \subset X$ such that $|f(x)| < \varepsilon$ for all $x \notin K$. For example,
if $X = \mathbb{R}$, then $f(x) = \exp(-x^2)$ lies in $C_0(\mathbb{R})$. If $X$ is compact, then $C_0(X) = C(X)$.

Second, $C_0(X)$ is a vector space under pointwise operations (including pointwise
complex conjugation as the involution), and is a Banach space in the sup-norm


© The Author(s) 2017 83 
K. Landsman, Foundations of Quantum Theory, 
Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3_3 




    $\|f\|_\infty = \sup_{x \in X} \{|f(x)|\}$.   (3.1)


The space $X$ making $A$ isomorphic to $C_0(X)$, then, is the Gelfand spectrum $\Sigma(A)$ of
$A$, which we already encountered (cf. Definition 1.4) as the set of nonzero algebra
homomorphisms from $A$ to $\mathbb{C}$. This set turns out to be a locally compact Hausdorff
space in the topology of pointwise convergence, and the isomorphism $A \to C_0(X)$ is
the Gelfand transform $a \mapsto \hat{a}$, where $\hat{a}(\omega) = \omega(a)$. Conversely, if $X$ is given, then
we associate the commutative C*-algebra $C_0(X)$ to it, as in Chapter 1.
Generalizing Definition 1.14, as a special case of the notion of a state we have:


Definition 3.2. A state on $C_0(X)$ is a positive (and hence bounded) linear functional
$\omega : C_0(X) \to \mathbb{C}$ with $\|\omega\| = 1$.


If $X$ is compact, given positivity one has $\|\omega\| = 1$ iff $\omega(1_X) = 1$, cf. Lemma C.4.
The appropriate generalization of Theorem 1.15 then reads (cf. Corollary B.21):


Theorem 3.3. Let $X$ be a locally compact Hausdorff space. There is a bijective
correspondence between states on $C_0(X)$ and probability measures on $X$, namely

    $\varphi(f) = \int_X d\mu\, f, \quad f \in C_0(X)$.   (3.2)

Moreover, pure states correspond to Dirac measures and hence to points of $X$.


In particular, a nonzero linear functional $\omega : C_0(X) \to \mathbb{C}$ is multiplicative iff it is a
pure state. This recovery of probability measures on phase space as states of the
associated algebra of observables $C_0(X)$, and of points in phase space as the associated
pure states, already familiar from the finite case, remains of great importance.

As in quantum mechanics, many interesting observables in classical mechanics
fail to be bounded, let alone $C_0$; coordinate functions (on non-compact phase spaces)
and the usual kinetic energy are a case in point. This is not a serious problem,
especially not if, as we shall assume from now on, $X$ is a (smooth) manifold (those
unfamiliar with this notion may always have $X = \mathbb{R}^k$ in mind). In that case, there is a
very natural class of (typically unbounded) functions on $X$, viz. $C^\infty(X) = C^\infty(X, \mathbb{R})$,
which form a commutative algebra just like $C_0(X) = C_0(X, \mathbb{C})$, and provide the
(algebraic) basis for the theory of symmetry and dynamics in classical physics, as we
shall now show (the fact that functions in $C^\infty(X)$ may be freely added and multiplied
provides a major simplification compared to unbounded operators in quantum
mechanics, even self-adjoint ones, which are most easily treated by transforming them
into bounded ones, as discussed in §B.21). In fact, the most natural mathematical
setting of classical physics is not operator theory, or even symplectic geometry (as
even mathematically minded people used to think until the 1980s), but rather the
more general and flexible framework of Poisson geometry, to which we now turn.
3.1 Vector fields and their flows


We do not assume familiarity with differential geometry and analysis on manifolds,
so in what follows one may assume that $X = \mathbb{R}^k$ for some $k$. However, whenever
possible we will phrase definitions and results in such a way that their more general
meaning should be clear to those who are familiar with differential geometry etc.
An old-fashioned vector field on $X = \mathbb{R}^k$ is a map

    $\xi : \mathbb{R}^k \to \mathbb{R}^k$;   (3.3)
    $\xi(x) = (\xi^1(x), \ldots, \xi^k(x))$,   (3.4)


which describes something like a hyper-arrow at $x$. However, this is a coordinate-dependent
object, which is hard to generalize to arbitrary manifolds. Therefore, in a
modern approach a vector field is seen as the corresponding first-order differential
operator $\xi : C^\infty(X) \to C^\infty(X)$ defined by

    $\xi f(x) = \sum_{j=1}^{k} \xi^j(x) \frac{\partial f}{\partial x^j}(x)$.   (3.5)

To make the idea precise that a vector field on $X$ is essentially the same as a
first-order differential operator on $C^\infty(X)$, we note that it easily follows from (3.5) that

    $\xi(fg) = \xi(f)g + f\xi(g)$,   (3.6)

for any $f, g \in C^\infty(X)$, where the product $fg$ is defined pointwise, i.e.,

    $(fg)(x) = f(x)g(x)$.   (3.7)

Similarly, we have pointwise addition and scalar multiplication, i.e., for $s, t \in \mathbb{R}$,

    $(sf + tg)(x) = sf(x) + tg(x)$.   (3.8)

This turns $C^\infty(X)$ into a commutative algebra (over $\mathbb{R}$, as $C^\infty(X) = C^\infty(X, \mathbb{R})$).
A derivation of an algebra $A$ (over $\mathbb{R}$) is a linear map $\delta : A \to A$ satisfying

    $\delta(ab) = \delta(a)b + a\delta(b)$.   (3.9)


Thus any vector field on $X$ defines a derivation of the algebra $C^\infty(X)$ by (3.5).
Conversely, a deep theorem of differential geometry states that for any manifold $X$, each
derivation of $C^\infty(X)$ takes the form (3.5), at least locally (and for $X = \mathbb{R}^k$ also
globally). Therefore, either as a definition or as a theorem, we often simply identify
vector fields on $X$ with derivations of $C^\infty(X)$. Derivations have a rich structure:


Definition 3.4. A (real) Lie algebra is a (real) vector space $A$ equipped with a bilinear
map $[\cdot,\cdot] : A \times A \to A$ that satisfies $[a,b] = -[b,a]$ (and hence $[a,a] = 0$) as well as

    $[a,[b,c]] + [c,[a,b]] + [b,[c,a]] = 0$ (Jacobi identity).   (3.10)
It is easy to see that the set $\mathrm{Vec}(X)$ of all old-fashioned vector fields $\xi$ on $X$ (i.e.,
in the sense (3.5)) forms a real Lie algebra under pointwise vector space operations
(i.e., $(s\xi + t\eta)(f) = s\xi f + t\eta f$) and the natural bracket

    $[\xi, \eta] = \xi\eta - \eta\xi$.   (3.11)

Similarly, the set $\mathrm{Der}(A)$ of all derivations on some algebra is a Lie algebra under
pointwise vector space operations and Lie bracket

    $[\delta_1, \delta_2] = \delta_1 \circ \delta_2 - \delta_2 \circ \delta_1$.   (3.12)

Of course, the identification of $\mathrm{Vec}(X)$ with $\mathrm{Der}(C^\infty(X))$ identifies (3.11) and (3.12).
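
A short worked example may help to see why the bracket (3.11) of two first-order operators is again first-order. The two vector fields on $\mathbb{R}^2$ below are chosen purely for illustration and do not come from the text; $\partial_j$ abbreviates $\partial/\partial x^j$.

```latex
\begin{align*}
\xi &= x^1 \partial_2, \qquad \eta = x^2 \partial_1; \\
\xi(\eta f) &= x^1 \partial_2\bigl(x^2\, \partial_1 f\bigr)
             = x^1\, \partial_1 f + x^1 x^2\, \partial_2 \partial_1 f; \\
\eta(\xi f) &= x^2 \partial_1\bigl(x^1\, \partial_2 f\bigr)
             = x^2\, \partial_2 f + x^1 x^2\, \partial_1 \partial_2 f; \\
[\xi,\eta] f &= \xi(\eta f) - \eta(\xi f)
             = x^1\, \partial_1 f - x^2\, \partial_2 f.
\end{align*}
```

The second-order terms cancel because mixed partial derivatives of a smooth function commute, so $[\xi,\eta] = x^1\partial_1 - x^2\partial_2$ is again of the form (3.5) and hence a derivation.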

Vector fields (or, equivalently, derivations) may be “integrated”, at least locally,
in the following sense. First, a curve through $x_0 \in X$ is a smooth map $c : I \to X$,
where $I \subset \mathbb{R}$ is open and $c(t_0) = x_0$ for some $t_0 \in I$. We usually assume that $0 \in I$
with $t_0 = 0$ and hence $c(0) = x_0$. We then say that $c$ integrates $\xi$ near $x_0$ if

    $\dot{c}(t) = \xi(c(t))$,   (3.13)

a somewhat symbolic equality that can be interpreted in two equivalent ways:

• Describing $c : I \to \mathbb{R}^k$ by $k$ functions $c^j : I \to \mathbb{R}$ ($j = 1, \ldots, k$), eq. (3.13) denotes

    $\frac{dc^j(t)}{dt} = \xi^j(c^1(t), \ldots, c^k(t)), \quad j = 1, \ldots, k$.   (3.14)

• More abstractly, eq. (3.13) means that for any $f \in C^\infty(X)$ we have

    $\frac{d}{dt} f(c(t)) = \xi f(c(t))$.   (3.15)

To pass from (3.15) to (3.14), we just have to recall (3.5), and note that

    $\frac{d}{dt} f(c(t)) = \frac{d}{dt} f(c^1(t), \ldots, c^k(t)) = \sum_{j=1}^{k} \frac{dc^j(t)}{dt} \frac{\partial f}{\partial x^j}(c(t))$.   (3.16)

The theory of ordinary differential equations shows that such local integral curves
exist near any point $x_0 \in X$, and that they are unique in the following sense: if two
curves $c_1 : I_1 \to X$ and $c_2 : I_2 \to X$ both satisfy (3.13) with $c_1(0) = c_2(0) = x_0$, then
$c_1 = c_2$ on $I_1 \cap I_2$. However, curves that integrate $\xi$ near some point may not be
defined for all $t$, i.e., for $I = \mathbb{R}$. This makes the concept of a flow of a vector field $\xi$,
which is meant to encapsulate all integral curves of $\xi$, a bit complicated. We start
with the simplest case. We say that a vector field $\xi$ is complete if for any $x_0 \in X$
there is a curve $c : \mathbb{R} \to X$ satisfying (3.13) with $c(0) = x_0$. The simplest example
of a complete vector field is $X = \mathbb{R}$ and $\xi = d/dx$, so that $\varphi_t(x) = x + t$. For an
incomplete example, take $X = \mathbb{R}$ and $\xi(x) = x^2\, d/dx$. It can be shown that a vector
field $\xi$ with compact support (in the sense that the set $\{x \in X \mid \xi(x) \neq 0\}$ is bounded)
is complete. In particular, any vector field on a compact manifold is complete.
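
To see concretely why a quadratic vector field on $\mathbb{R}$ fails to be complete, one may solve (3.13) explicitly. The following short derivation assumes $\xi(x) = x^2\, d/dx$ and an initial point $x_0 \neq 0$.

```latex
\begin{align*}
\dot{c}(t) = c(t)^2
  \;&\Longrightarrow\; \frac{d}{dt}\left(-\frac{1}{c(t)}\right) = 1
  \;\Longrightarrow\; -\frac{1}{c(t)} + \frac{1}{x_0} = t; \\
c(t) &= \frac{x_0}{1 - x_0 t}, \qquad c(0) = x_0 .
\end{align*}
```

For $x_0 > 0$ this integral curve exists only for $t < 1/x_0$ and escapes to infinity as $t \uparrow 1/x_0$ (for $x_0 < 0$ the same happens backwards in time, as $t \downarrow 1/x_0$), so no curve $c : \mathbb{R} \to X$ through $x_0$ satisfies (3.13). By contrast, for $\xi = d/dx$ the solution $c(t) = x_0 + t$ is defined for all $t \in \mathbb{R}$.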
Definition 3.5. Let $X$ be a manifold and let $\xi \in \mathrm{Vec}(X)$ be a complete vector field.
A flow of $\xi$ is a smooth map $\varphi : \mathbb{R} \times X \to X$, written

    $\varphi_t(x) = \varphi(t, x)$,   (3.17)

that satisfies

    $\varphi_0(x) = x$;   (3.18)
    $\varphi_s \circ \varphi_t = \varphi_{s+t}$,   (3.19)

and that integrates $\xi$ in the sense that for each $t \in \mathbb{R}$ and $x \in X$,

    $\xi(\varphi_t(x)) = \frac{d\varphi_t(x)}{dt}$.   (3.20)

As before, eq. (3.20) by definition means that for each $f \in C^\infty(X)$ we have

    $\xi f(\varphi_t(x)) = \frac{d}{dt} f(\varphi_t(x))$,   (3.21)

or, equivalently, that in local coordinates, where

    $\varphi_t(x) = (\varphi_t^1(x), \ldots, \varphi_t^k(x))$,   (3.22)

we have

    $\frac{d\varphi_t^j(x)}{dt} = \xi^j(\varphi_t(x)), \quad j = 1, \ldots, k$.   (3.23)

Indeed, the flow $\varphi$ of $\xi$ gives the integral curve $c$ of $\xi$ through $x_0$ by

    $c(t) = \varphi_t(x_0)$.   (3.24)

According to the Picard-Lindelöf Theorem in the theory of ordinary differential
equations, any complete vector field has a unique flow. In fact, the uniqueness part
of this theorem implies that (3.19) is a consequence of (3.20) with (3.18), but it
is convenient to state (3.19) separately, so as to make the point that the flow of a
complete vector field ξ on X is a smooth R-action on X, as defined by conditions
(3.18) - (3.19), whose orbits integrate ξ. In particular, each φ_t : X → X is invertible,
with inverse φ_t^{-1} = φ_{-t}. Moreover, X is a disjoint union of the integral curves of
ξ, which can never cross each other because of the uniqueness of the solution of the
initial-value problem (3.13) with c(0) = x_0.
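The group law (3.19) and the inverse φ_t^{-1} = φ_{-t} can be checked numerically. The sketch below is only an illustration under stated assumptions: the vector field ξ = -y ∂/∂x + x ∂/∂y on R² is a hypothetical example (its flow consists of rotations), and the flow is approximated by classical Runge-Kutta steps rather than computed exactly, so the defects are small but not zero.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def rotation_field(x: Point) -> Point:
    """Illustrative complete vector field xi = -y d/dx + x d/dy on R^2."""
    return (-x[1], x[0])


def flow(field: Callable[[Point], Point], t: float, x: Point, steps: int = 2000) -> Point:
    """Approximate phi_t(x) by classical fourth-order Runge-Kutta steps."""
    h = t / steps
    px, py = x
    for _ in range(steps):
        k1 = field((px, py))
        k2 = field((px + 0.5 * h * k1[0], py + 0.5 * h * k1[1]))
        k3 = field((px + 0.5 * h * k2[0], py + 0.5 * h * k2[1]))
        k4 = field((px + h * k3[0], py + h * k3[1]))
        px += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        py += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return (px, py)


def distance(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def group_law_defect(s: float, t: float, x: Point) -> float:
    """Distance between phi_s(phi_t(x)) and phi_{s+t}(x), cf. (3.19)."""
    composed = flow(rotation_field, s, flow(rotation_field, t, x))
    return distance(composed, flow(rotation_field, s + t, x))


def inverse_defect(t: float, x: Point) -> float:
    """Distance between phi_{-t}(phi_t(x)) and x."""
    return distance(flow(rotation_field, -t, flow(rotation_field, t, x)), x)


if __name__ == "__main__":
    x0 = (1.0, 0.5)
    print(group_law_defect(0.7, 1.9, x0), inverse_defect(1.9, x0))
```

Both defects are at the level of the integration error, consistent with the flow being an R-action.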

If ξ is not complete, we do the best we can by defining the set

\[ D_\xi = \{(t,x) \in \mathbb{R} \times X \mid \exists\, c : I \to X,\ c(0) = x,\ t \in I\} \subset \mathbb{R} \times X, \tag{3.25} \]

where it is understood that c satisfies (3.13). Obviously {0} × X ⊂ D_ξ, and (less
trivially) it turns out that D_ξ is open. Then a flow of ξ is a map φ : D_ξ → X that
satisfies (3.18) for all x, eq. (3.21) for (t,x) ∈ D_ξ, as well as (3.19) whenever defined.
3.2 Poisson brackets and Hamiltonian vector fields


To obtain flows, classical mechanics requires more than a manifold structure:
Definition 3.6. A Poisson bracket on a manifold X is a Lie bracket {−,−} on (the
real vector space) C^∞(X), such that for each h ∈ C^∞(X) the map

\[ \delta_h : f \mapsto \{h, f\} \tag{3.26} \]

is a vector field on X (or, equivalently, a derivation of C^∞(X,R) with respect to
its structure of a commutative algebra under pointwise multiplication). A manifold
X equipped with a Poisson bracket is called a Poisson manifold, (C^∞(X), { , }) is
called a Poisson algebra, and ξ_h is called the Hamiltonian vector field of h.


Unfolding, we have a bilinear map {−,−} : C^∞(X) × C^∞(X) → C^∞(X) that satisfies

\[ \{g, f\} = -\{f, g\}; \tag{3.27} \]
\[ \{f, \{g, h\}\} + \{h, \{f, g\}\} + \{g, \{h, f\}\} = 0; \tag{3.28} \]
\[ \{f, gh\} = \{f, g\}\, h + g\, \{f, h\}. \tag{3.29} \]


Bilinearity and the abstract properties (3.27) - (3.29) imply: 
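Proposition 3.7. Each Poisson bracket on X defines a Lie algebra homomorphism

\[ C^\infty(X) \to \mathrm{Der}(C^\infty(X)); \tag{3.30} \]
\[ h \mapsto \delta_h, \tag{3.31} \]

or, equivalently, a Lie algebra homomorphism

\[ C^\infty(X) \to \mathrm{Vec}(X); \tag{3.32} \]
\[ h \mapsto \xi_h. \tag{3.33} \]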


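The time-honored example is X = R^{2n}, with coordinates x = (p,q) and bracket

\[ \{f, g\} = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q^j} - \frac{\partial f}{\partial q^j}\frac{\partial g}{\partial p_j} \right). \tag{3.34} \]

In that case, the Hamiltonian vector field of h is obviously given by

\[ \xi_h = \sum_{j=1}^{n} \left( \frac{\partial h}{\partial p_j}\frac{\partial}{\partial q^j} - \frac{\partial h}{\partial q^j}\frac{\partial}{\partial p_j} \right). \tag{3.35} \]

The flow of ξ_h gives the motion of a system with Hamiltonian h. Writing

\[ \varphi_t(p,q) = (p(t), q(t)), \]

we see from (3.23) that this flow is given by Hamilton's equations

\[ \frac{dp_j(t)}{dt} = -\frac{\partial h(p(t),q(t))}{\partial q^j}; \tag{3.36} \]
\[ \frac{dq^j(t)}{dt} = \frac{\partial h(p(t),q(t))}{\partial p_j}. \tag{3.37} \]
Hamiltonians of the special form
\[ h(p,q) = \frac{p^2}{2m} + V(q), \tag{3.38} \]
where p² = Σ_j p_j², give Newton's equation "F = ma", where F_j = −∂V/∂q^j, viz.
\[ F_j(q(t)) = m\, \frac{d^2 q^j(t)}{dt^2}. \tag{3.39} \]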
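Hamilton's equations (3.36) - (3.37) for a Hamiltonian of the form (3.38) can be integrated numerically. The following is a minimal sketch, not a method used in the text: it assumes one degree of freedom, uses the Störmer-Verlet (leapfrog) scheme, and takes the harmonic potential V(q) = q²/2 purely as a hypothetical test case with a known exact solution.

```python
import math
from typing import Callable, Tuple


def hamiltonian_flow(p: float, q: float, t: float,
                     grad_v: Callable[[float], float],
                     m: float = 1.0, steps: int = 10_000) -> Tuple[float, float]:
    """Approximate phi_t(p, q) for h = p^2/(2m) + V(q) by leapfrog steps."""
    dt = t / steps
    for _ in range(steps):
        p -= 0.5 * dt * grad_v(q)   # half step of dp/dt = -dV/dq, cf. (3.36)
        q += dt * p / m             # full step of dq/dt = p/m, cf. (3.37)
        p -= 0.5 * dt * grad_v(q)   # second half step for p
    return p, q


def energy(p: float, q: float, v: Callable[[float], float], m: float = 1.0) -> float:
    """Value of the Hamiltonian (3.38) at the point (p, q)."""
    return p * p / (2.0 * m) + v(q)


if __name__ == "__main__":
    v = lambda q: 0.5 * q * q        # assumed potential V(q) = q^2/2
    grad_v = lambda q: q             # its derivative
    p1, q1 = hamiltonian_flow(0.0, 1.0, 1.0, grad_v)
    print("numerical:", p1, q1)
    print("exact:    ", -math.sin(1.0), math.cos(1.0))
    print("energy drift:", energy(p1, q1, v) - energy(0.0, 1.0, v))
```

For this potential the exact flow is a rotation in the (p,q)-plane, and the value of h stays constant along it up to the discretization error.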


Proposition 3.8. For any vector field ξ on a manifold X, we say that a function
f ∈ C^∞(X) is conserved if f is constant along the flow of ξ. If X is a Poisson
manifold and ξ = ξ_h is Hamiltonian, then f is conserved iff {h, f} = 0.


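The proof is trivial. A Poisson bracket on X may also be defined in terms of a Poisson
tensor. In coordinates, this is just an anti-symmetric matrix B^{ij}(x) that satisfies

\[ \sum_l \left( B^{lj}\frac{\partial B^{ik}}{\partial x^l} + B^{li}\frac{\partial B^{kj}}{\partial x^l} + B^{lk}\frac{\partial B^{ji}}{\partial x^l} \right) = 0 \tag{3.40} \]

for each (i, j, k). In terms of B, the Poisson bracket is then defined abstractly by
\[ \{f, g\} = B(df, dg), \tag{3.41} \]

using standard notation of differential geometry, or, in coordinates, by
\[ \{f, g\}(x) = \sum_{i,j} B^{ij}(x)\, \frac{\partial f(x)}{\partial x^i}\frac{\partial g(x)}{\partial x^j}. \tag{3.42} \]

Conversely, a Poisson bracket must come from a Poisson tensor: for any derivation
δ on C^∞(X), the function δ(g) depends linearly on dg, so if δ_f(g) = {f,g}, then
δ_f(g) = −δ_g(f), so that {f,g} depends linearly on both df and dg. This enforces
(3.42), upon which (3.41) implies (3.40). A nice example is X = R³, with

\[ \{f, g\}(x,y,z) = x\left(\frac{\partial f}{\partial y}\frac{\partial g}{\partial z} - \frac{\partial f}{\partial z}\frac{\partial g}{\partial y}\right) + y\left(\frac{\partial f}{\partial z}\frac{\partial g}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial g}{\partial z}\right) + z\left(\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x}\right); \]
\[ B^{ij}(x) = \sum_k \epsilon_{ijk}\, x_k. \tag{3.43} \]

Finally, we say that a Poisson manifold is symplectic if the corresponding Poisson
tensor B(x) is given by an invertible matrix, for each x ∈ X. This requires X to be
even-dimensional. For example, R^{2n} with Poisson bracket (3.34) is symplectic.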
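As a sanity check on the R³ example, the sketch below verifies in exact integer arithmetic that B^{ij}(x) = Σ_k ε_{ijk} x_k is anti-symmetric, satisfies a cyclic condition of the type (3.40), and has vanishing determinant (so that this Poisson manifold is not symplectic, as expected in odd dimension). The helper names and the sample point are hypothetical, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
from itertools import product
from typing import List, Sequence


def epsilon(i: int, j: int, k: int) -> int:
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def poisson_tensor(x: Sequence[int]) -> List[List[int]]:
    """B^{ij}(x) = sum_k epsilon_{ijk} x_k."""
    return [[sum(epsilon(i, j, k) * x[k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def d_poisson_tensor(i: int, j: int, l: int) -> int:
    """Partial derivative of B^{ij} with respect to x^l (constant for this B)."""
    return epsilon(i, j, l)


def jacobi_defect(x: Sequence[int], i: int, j: int, k: int) -> int:
    """Cyclic sum over l of B^{lj} dB^{ik}/dx^l + B^{li} dB^{kj}/dx^l + B^{lk} dB^{ji}/dx^l."""
    b = poisson_tensor(x)
    return sum(b[l][j] * d_poisson_tensor(i, k, l)
               + b[l][i] * d_poisson_tensor(k, j, l)
               + b[l][k] * d_poisson_tensor(j, i, l)
               for l in range(3))


def det3(m: List[List[int]]) -> int:
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


if __name__ == "__main__":
    point = (2, -3, 5)  # arbitrary sample point
    print(all(jacobi_defect(point, i, j, k) == 0 for i, j, k in product(range(3), repeat=3)))
    print(det3(poisson_tensor(point)))
```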
3.3 Symmetries of Poisson manifolds


Two equivalent notions of symmetries of classical physics suggest themselves: one
is based on the idea of a Poisson manifold (X,B), the other comes from the equivalent
notion of a Poisson algebra (C^∞(X), { , }).


Definition 3.9. 1. A symmetry of a Poisson manifold (X,B) is a diffeomorphism
φ : X → X (that is, an invertible smooth map with smooth inverse) satisfying

\[ \varphi_* B = B. \tag{3.44} \]

2. A symmetry of a Poisson algebra (C^∞(X), { , }) is an invertible linear map
α : C^∞(X) → C^∞(X) that satisfies (for each f,g ∈ C^∞(X)):

\[ \alpha(fg) = \alpha(f)\alpha(g); \tag{3.45} \]
\[ \alpha(\{f,g\}) = \{\alpha(f),\alpha(g)\}. \tag{3.46} \]


Let us define the push-forward φ_* in (3.44). We do this in terms of the pullback φ^*
of a smooth (i.e., infinitely often differentiable) map φ : X → X, defined as

\[ \varphi^* : C^\infty(X) \to C^\infty(X); \tag{3.47} \]
\[ \varphi^* f = f \circ \varphi. \tag{3.48} \]
If φ is a diffeomorphism, the push-forward φ_* of φ, which acts on derivations, is

\[ \varphi_* : \mathrm{Der}(C^\infty(X)) \to \mathrm{Der}(C^\infty(X)); \tag{3.49} \]
\[ (\varphi_*\delta)(f) = \delta(\varphi^* f) \circ \varphi^{-1}; \tag{3.50} \]


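this may be checked to define a derivation, as follows:

\[ (\varphi_*\delta)(fg) = \delta(\varphi^* f \cdot \varphi^* g) \circ \varphi^{-1} = \left(\delta(\varphi^* f)\,\varphi^* g + \varphi^* f\,\delta(\varphi^* g)\right) \circ \varphi^{-1} = (\varphi_*\delta)(f)\, g + f\, (\varphi_*\delta)(g). \]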


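If, given coordinates x = (x^1, ..., x^k) on X, we now (without loss of generality)
take our derivation δ to be a vector field ξ = Σ_j ξ^j ∂/∂x^j, and write
φ(x) = (φ^1(x), ..., φ^k(x)), for the image φ_*(ξ) we obtain

\[ \begin{aligned}
(\varphi_*\xi)(f)(x) &= (\xi(\varphi^* f))(\varphi^{-1}(x)) \\
&= \sum_j \xi^j(\varphi^{-1}(x))\, \frac{\partial (f\circ\varphi)}{\partial x^j}(\varphi^{-1}(x)) \\
&= \sum_{i,j} \xi^j(\varphi^{-1}(x))\, \frac{\partial \varphi^i}{\partial x^j}(\varphi^{-1}(x))\, \frac{\partial f}{\partial x^i}(x),
\end{aligned} \]
so that
\[ (\varphi_*\xi)^i(x) = \sum_j \frac{\partial\varphi^i}{\partial x^j}(\varphi^{-1}(x))\, \xi^j(\varphi^{-1}(x)), \tag{3.51} \]
or, equivalently,
\[ (\varphi_*\xi)^i(\varphi(x)) = \sum_j \frac{\partial\varphi^i(x)}{\partial x^j}\, \xi^j(x), \tag{3.52} \]

which only depends on ξ(x), so that for each x ∈ X, φ_* may be localized to a linear
map φ_*(x) : T_xX → T_{φ(x)}X. This may be done even if φ is not invertible. Physicists
often write this as φ(x) = y = y(x^1, ..., x^k), ξ = v, φ_*ξ = v′, so that we have a
"covariant" transformation rule (v′)^i(y) = Σ_j (∂y^i(x)/∂x^j) v^j(x).

Taking tensor products, one obtains similar rules for higher-order tensors. For
example, if N = X, the transformation rule for the Poisson tensor B reads

\[ (\varphi_* B)^{ij}(\varphi(x)) = \sum_{m,n=1}^{k} \frac{\partial\varphi^i(x)}{\partial x^m}\frac{\partial\varphi^j(x)}{\partial x^n}\, B^{mn}(x), \tag{3.53} \]

so that, in coordinates, the invariance requirement (3.44) reads

\[ \sum_{m,n=1}^{k} \frac{\partial\varphi^i(x)}{\partial x^m}\frac{\partial\varphi^j(x)}{\partial x^n}\, B^{mn}(x) = B^{ij}(\varphi(x)). \tag{3.54} \]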
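For a linear map φ(x) = Mx the Jacobian ∂φ^i/∂x^m is the constant matrix M, so for a constant Poisson tensor the invariance condition (3.54) reduces to M B M^T = B. The sketch below checks this for the canonical bracket (3.34) on R² with coordinates ordered as x = (p,q), so that B^{pq} = {p,q} = 1. The three sample maps are hypothetical illustrations: a shear, a quarter turn, and a uniform scaling (which fails the test).

```python
from typing import List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


# Canonical Poisson tensor for x = (p, q): {p, q} = 1 and {q, p} = -1.
CANONICAL_B: Matrix = [[0, 1], [-1, 0]]


def preserves_poisson_tensor(jacobian: Matrix, b: Matrix) -> bool:
    """Check M B M^T = B, the linear form of the invariance condition (3.54)."""
    return matmul(matmul(jacobian, b), transpose(jacobian)) == b


if __name__ == "__main__":
    shear = [[1, 0], [3, 1]]          # (p, q) -> (p, q + 3p)
    quarter_turn = [[0, 1], [-1, 0]]  # (p, q) -> (q, -p)
    scaling = [[2, 0], [0, 2]]        # (p, q) -> (2p, 2q)
    for name, m in (("shear", shear), ("quarter turn", quarter_turn), ("scaling", scaling)):
        print(name, preserves_poisson_tensor(m, CANONICAL_B))
```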


Theorem 3.10. The two parts of Definition 3.9 are equivalent, in that:


1. Given a diffeomorphism φ : X → X satisfying (3.44), the map
\[ \alpha = \varphi^*, \tag{3.55} \]

i.e., α(f) = f ∘ φ, is linear, invertible, and satisfies (3.45) - (3.46).

2. Given an invertible linear map α : C^∞(X) → C^∞(X) that satisfies (3.45) - (3.46),
there is a unique diffeomorphism φ : X → X inducing α as in (3.55).

3. This correspondence defines an anti-isomorphism between the group Diff(X,B)
of diffeomorphisms of X satisfying (3.44) and the group Aut(C^∞(X), { , }) of
invertible linear maps α : C^∞(X) → C^∞(X) that satisfy (3.45) - (3.46).


Here an anti-isomorphism of groups is just an isomorphism that inverts the order of
multiplication. This complication may be removed by writing φ^{-1} instead of φ in
(3.55), but that change would make the next proposition a bit less natural.


Proof. The first claim is true by construction. The hard part is the second claim,
which follows from a more general result about manifolds (note that in our
terminology, manifolds are by definition assumed to be Hausdorff):


Proposition 3.11. Let X and Y be smooth manifolds. Then (3.55) establishes a
bijective correspondence between linear maps α : C^∞(X) → C^∞(Y) satisfying (3.45)
and smooth maps φ : Y → X.
The proof is quite similar to a central part of the proof of Gelfand duality for
commutative C*-algebras, in which (3.55) establishes a bijective correspondence between
C*-homomorphisms α : C(X) → C(Y) and continuous maps φ : Y → X, where X
and Y are compact Hausdorff spaces; see §C.3 and especially Proposition C.22.

For any commutative real algebra A, let Σ(A) be the space of non-zero algebra
homomorphisms ω : A → R (these are just the non-zero multiplicative linear maps),
equipped with the weakest topology that makes each function â : Σ(A) → R continuous,
where â(ω) = ω(a). Furthermore, if B is another commutative real algebra,
then any homomorphism α : A → B induces a continuous map α^* : Σ(B) → Σ(A)
in the obvious way, that is, by α^*ω = ω ∘ α. In the special case A = C^∞(X) (and
similarly if A = C(X)), one has a canonical map ev^X : X → Σ(C^∞(X)), given by
ev^X_x(f) = f(x). The whole point (in which the entire difficulty of the proof lies)
is that this map is a bijection (see Proposition C.21), which simultaneously equips
Σ(C^∞(X)) with a smooth structure that makes ev^X a diffeomorphism (by definition of
the smooth structure on Σ(C^∞(X))). In view of all this, given a multiplicative linear map
α : C^∞(X) → C^∞(Y), we obtain a continuous map φ : Y → X by

\[ \varphi = (\mathrm{ev}^X)^{-1} \circ \alpha^* \circ \mathrm{ev}^Y. \tag{3.56} \]

Eq. (3.55) then holds by construction. Smoothness of φ, then, is a consequence of
the fact that α(f) = f ∘ φ must be a smooth function on Y for any f ∈ C^∞(X).
Applying this to the setting of Theorem 3.10 easily yields all claims.


In what follows, we look at smooth actions of Lie groups on (Poisson) manifolds
X, in other words, at homomorphisms φ : G → Diff(X) or φ : G → Diff(X,B), where
G is a Lie group, Diff(X) is the group of all diffeomorphisms of a manifold, and
Diff(X,B) is the group of all diffeomorphisms of a Poisson manifold preserving
the Poisson structure. Foregoing the underlying differential geometry, we take a
pragmatic attitude and only study linear Lie groups, defined as closed subgroups G
of GL_n(R) or GL_n(C), with group multiplication given by matrix multiplication and
hence group inverse being matrix inverse. Here one may think of SU(2) ⊂ GL_2(C)
or SO(3) ⊂ GL_3(R), but also abelian Lie groups like the additive groups R^n fall
under this scope, since one may identify a ∈ R^n with the 2n × 2n-matrix


\[ a \equiv \begin{pmatrix} 1_n & \mathrm{diag}(a) \\ 0 & 1_n \end{pmatrix}, \tag{3.57} \]


in which case matrix multiplication indeed reproduces addition. Similarly, the
(2n+1)-dimensional Heisenberg group H_n is the group of real (n+2) × (n+2)-matrices

\[ (a,b,c) = \begin{pmatrix} 1 & a^T & c + \tfrac{1}{2} a^T b \\ 0 & 1_n & b \\ 0 & 0 & 1 \end{pmatrix}, \tag{3.58} \]

where a,b ∈ R^n, c ∈ R, and a^T b = ⟨a,b⟩; this gives the multiplication rule


\[ (a,b,c)\cdot(a',b',c') = \left(a+a',\ b+b',\ c+c'+\tfrac{1}{2}\left(\langle a,b'\rangle - \langle a',b\rangle\right)\right). \tag{3.59} \]
If G is a linear Lie group, its Lie algebra g may be defined as the vector space
\[ \mathfrak{g} = \{A \in M_n(K) \mid e^{tA} \in G \ \ \forall\, t \in \mathbb{R}\}, \tag{3.60} \]

where K = R or C, as determined by the embedding G ⊂ GL_n(R) or G ⊂ GL_n(C).
Either way, g is seen as a real vector space, equipped with the Lie bracket

\[ [A,B] = AB - BA. \tag{3.61} \]
This is trivially a bilinear antisymmetric map g × g → g satisfying the Jacobi identity
\[ [A,[B,C]] + [C,[A,B]] + [B,[C,A]] = 0, \tag{3.62} \]
which in turn expresses the fact that for fixed A ∈ g the map δ_A : g → g defined by
\[ \delta_A(B) = [A,B] \tag{3.63} \]

is a derivation of g with respect to its Lie bracket, i.e.,
\[ \delta_A([B,C]) = [\delta_A(B),C] + [B,\delta_A(C)]. \tag{3.64} \]

The exponential map exp : g → G is then just given by its usual power series, which
for matrices is norm-convergent. Conversely, one may pass from G to g through

\[ A = \frac{d}{dt}\, e^{tA}\Big|_{t=0}. \tag{3.65} \]
If G = R^n, we also have g = R^n, and eq. (3.57) implies that exp is the identity map.
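For example, since SO(3) is the subgroup of GL_3(R) consisting of matrices R
that satisfy R^T R = 1_3, its Lie algebra so(3) consists of all matrices a that satisfy
a^T = −a. As a vector space we have so(3) ≅ R³, which follows by choosing a basis

\[ J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad J_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{3.66} \]

of the 3 × 3 real antisymmetric matrices. The commutators of these elements are
\[ [J_1,J_2] = J_3; \quad [J_3,J_1] = J_2; \quad [J_2,J_3] = J_1. \tag{3.67} \]

For the Lie algebra of the Heisenberg group we obtain h_n = R^{2n+1}, with basis

\[ P_i = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -e_i \\ 0 & 0 & 0 \end{pmatrix}, \quad Q_j = \begin{pmatrix} 0 & e_j^T & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{3.68} \]
where (e_1, ..., e_n) is the usual basis of R^n, satisfying commutation relations

\[ [P_i, Q_j] = \delta_{ij} Z; \quad [P_i,P_j] = [Q_i,Q_j] = [P_i,Z] = [Q_j,Z] = 0. \tag{3.69} \]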
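The commutation relations of so(3) and of the Heisenberg Lie algebra are easy to confirm by direct matrix multiplication. The sketch below does so in exact integer arithmetic; it assumes the standard basis of antisymmetric 3 × 3 matrices for so(3) and restricts the Heisenberg algebra to n = 1, where P, Q and Z are 3 × 3 matrices. The helper names are illustrative.

```python
from typing import List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a: Matrix, b: Matrix) -> Matrix:
    """Matrix Lie bracket [A, B] = AB - BA, cf. (3.61)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


ZERO: Matrix = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# Standard basis of so(3).
J1: Matrix = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2: Matrix = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3: Matrix = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Heisenberg Lie algebra for n = 1.
P: Matrix = [[0, 0, 0], [0, 0, -1], [0, 0, 0]]
Q: Matrix = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Z: Matrix = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

if __name__ == "__main__":
    print(commutator(J1, J2) == J3, commutator(J3, J1) == J2, commutator(J2, J3) == J1)
    print(commutator(P, Q) == Z, commutator(P, Z) == ZERO, commutator(Q, Z) == ZERO)
```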
3.4 The momentum map


Leaving out the Poisson structure for the moment, let X be a manifold, let G be a
Lie group, and let φ : G → Diff(X) be a homomorphism; as already mentioned, this
corresponds to a smooth action φ : G × X → X, which we simply write as

\[ \gamma \cdot x \equiv \varphi_\gamma(x) \equiv \varphi(\gamma, x). \]

In terms of the pullback φ_γ^*(f) = f ∘ φ_γ, we then automatically have

\[ \varphi_\gamma^*(fg) = \varphi_\gamma^*(f)\, \varphi_\gamma^*(g). \tag{3.70} \]

For each A ∈ g we then define a map δ_A : C^∞(X) → C^∞(X) by

\[ \delta_A f(x) = \frac{d}{dt} f(e^{-tA}x)\Big|_{t=0}. \tag{3.71} \]

This map is obviously linear. Moreover, it can be shown that δ is well behaved:


Proposition 3.12. The map δ : g → Der(C^∞(X)), A ↦ δ_A is a homomorphism of Lie
algebras, i.e., each δ_A is a derivation, δ is linear in A, and, for each A,B ∈ g,

\[ [\delta_A, \delta_B] = \delta_{[A,B]}. \tag{3.72} \]


The proof relies on Hadamard's Lemma, which we only need for complete vector
fields, or, equivalently, for derivations with complete flow (i.e., defined for all t).


Lemma 3.13. If δ is a derivation of C^∞(X) with complete flow φ, and f ∈ C^∞(X),
then there is a function g(t,x) ≡ g_t(x) such that for all x and t,

\[ g_0(x) = \delta f(x); \tag{3.73} \]
\[ f(\varphi_t(x)) = f(x) + t\, g_t(x). \tag{3.74} \]

Indeed, if the flow is complete one may take

\[ g_t(x) = \int_0^1 ds\, \dot{F}(st, x), \tag{3.75} \]

where F(t,x) = f(φ_t(x)) and (in Newton's notation) Ḟ is the time derivative of F.
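Hadamard's Lemma can be illustrated numerically in the simplest setting. The sketch below assumes X = R with the translation flow φ_t(x) = x + t, so that F(t,x) = f(x+t) and Ḟ(u,x) = f′(x+u), and it approximates the integral defining g_t by the midpoint rule; the test function f = sin and all helper names are hypothetical choices made for illustration.

```python
import math
from typing import Callable


def hadamard_g(t: float, x: float, f_prime: Callable[[float], float], n: int = 2000) -> float:
    """Midpoint-rule value of g_t(x) = int_0^1 f'(x + s*t) ds for the translation flow."""
    total = 0.0
    for i in range(n):
        s = (i + 0.5) / n
        total += f_prime(x + s * t)
    return total / n


def hadamard_defect(t: float, x: float,
                    f: Callable[[float], float],
                    f_prime: Callable[[float], float]) -> float:
    """|f(phi_t(x)) - f(x) - t*g_t(x)|, which should vanish up to quadrature error."""
    return abs(f(x + t) - f(x) - t * hadamard_g(t, x, f_prime))


if __name__ == "__main__":
    x, t = 0.4, 1.5
    print("defect:", hadamard_defect(t, x, math.sin, math.cos))
    # At t = 0 the function g reduces to the derivation applied to f, here f'(x).
    print("g_0(x) - f'(x):", hadamard_g(0.0, x, math.cos) - math.cos(x))
```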
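Proof. To prove that δ_A is linear in A, let φ be the flow of δ_A, i.e., φ_t(x) = e^{-tA}x.
For B ∈ g, Hadamard's Lemma with δ ⇝ δ_A and x ⇝ e^{-tB}x then gives us

\[ \begin{aligned}
f(e^{-tA}e^{-tB}x) &= f(\varphi_t(e^{-tB}x)) = f(e^{-tB}x) + t\, g_t(e^{-tB}x); \\
\frac{d}{dt} f(e^{-tA}e^{-tB}x)\Big|_{t=0} &= \delta_B f(x) + g_0(x) = \delta_B f(x) + \delta_A f(x).
\end{aligned} \tag{3.76} \]

On the other hand, since A and B are matrices, we may use the CBH-formula
\[ e^{tA}e^{tB} = e^{tA + tB + \frac{1}{2}t^2[A,B] + O(t^3)}, \tag{3.77} \]

which gives e^{-tA}e^{-tB} = e^{-t(A+B)}(1 + O(t²)), and hence

\[ \frac{d}{dt} f(e^{-tA}e^{-tB}x)\Big|_{t=0} = \frac{d}{dt} f(e^{-t(A+B)}x)\Big|_{t=0} = \delta_{A+B} f(x). \tag{3.78} \]

Comparing (3.76) with (3.78) gives δ_{A+B} = δ_A + δ_B. The property δ_{sA} = sδ_A is
trivial. We now prove (3.72). Within the (matrix) Lie algebra g we have
\[ [A,B] = -\frac{d}{dt}\left(e^{-tA} B e^{tA}\right)\Big|_{t=0} = -\lim_{t\to 0} \frac{e^{-tA} B e^{tA} - B}{t}. \tag{3.79} \]

Furthermore, for any g ∈ G one has e^{gBg^{-1}} = g e^B g^{-1}, so linearity of δ gives

\[ \begin{aligned}
\delta_{[A,B]} f(x) &= -\lim_{t\to 0} \frac{1}{t}\left(\delta_{e^{-tA}Be^{tA}} f(x) - \delta_B f(x)\right) \\
&= -\lim_{s,t\to 0} \frac{1}{st}\left(f(e^{-tA}e^{-sB}e^{tA}x) - f(e^{-sB}x)\right) \\
&= -\lim_{s,t\to 0} \frac{1}{st}\left(f(e^{-tA}e^{-sB}e^{tA}x) - f(e^{-tA}e^{tA}e^{-sB}x)\right) \\
&= -\lim_{s,t\to 0} \frac{1}{st}\left(f\circ\varphi_t(e^{-sB}e^{tA}x) - f\circ\varphi_t(e^{tA}e^{-sB}x)\right) \\
&= -\lim_{s,t\to 0} \left(\frac{1}{st}\left(f(e^{-sB}e^{tA}x) - f(e^{tA}e^{-sB}x)\right) + \frac{1}{s}\left(g_t(e^{-sB}e^{tA}x) - g_t(e^{tA}e^{-sB}x)\right)\right) \\
&= [\delta_A, \delta_B] f(x),
\end{aligned} \]

since in the limit t → 0 the third term in the penultimate line cancels the fourth.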
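The homomorphism property [δ_A, δ_B] = δ_{[A,B]} can be checked by hand on linear functions. The sketch below is a restricted illustration under explicit assumptions: G is a matrix group acting on R³, the convention δ_A f(x) = d/dt f(e^{-tA}x)|_{t=0} is used, and only linear functions f_c(x) = c·x are considered, on which δ_A acts by c ↦ −A^T c. It also shows that the opposite sign convention in the exponent would produce an anti-homomorphism on such functions.

```python
from typing import List

Matrix = List[List[int]]
Vector = List[int]

J1: Matrix = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2: Matrix = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a: Matrix, b: Matrix) -> Matrix:
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def delta(a: Matrix, c: Vector, sign: int = -1) -> Vector:
    """Coefficients of d/dt f_c(e^{sign*t*A} x) at t = 0, i.e. sign * A^T c."""
    n = len(c)
    return [sign * sum(a[k][i] * c[k] for k in range(n)) for i in range(n)]


def bracket_of_deltas(a: Matrix, b: Matrix, c: Vector, sign: int = -1) -> Vector:
    """Coefficients of [delta_A, delta_B] f_c."""
    ab = delta(a, delta(b, c, sign), sign)
    ba = delta(b, delta(a, c, sign), sign)
    return [u - v for u, v in zip(ab, ba)]


if __name__ == "__main__":
    c = [1, 2, 3]
    print(bracket_of_deltas(J1, J2, c), delta(commutator(J1, J2), c))
    # With the opposite convention e^{+tA} the two sides differ by a sign.
    print(bracket_of_deltas(J1, J2, c, sign=+1), delta(commutator(J1, J2), c, sign=+1))
```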


Now suppose that, in addition, X is a Poisson manifold, and that each φ_γ acts on
X as a Poisson symmetry, in that

\[ (\varphi_\gamma)_* B = B, \tag{3.80} \]

cf. (3.44), or, equivalently, cf. (3.46),

\[ \varphi_\gamma^*(\{f,g\}) = \{\varphi_\gamma^*(f), \varphi_\gamma^*(g)\}. \tag{3.81} \]
This implies, for each A ∈ g, and each f,g ∈ C^∞(X),
\[ \delta_A(\{f,g\}) = \{\delta_A(f), g\} + \{f, \delta_A(g)\}. \tag{3.82} \]
Compare this with the following property δ_A already has since it is a derivation:
\[ \delta_A(fg) = \delta_A(f)\, g + f\, \delta_A(g). \tag{3.83} \]

We may call a derivation δ : C^∞(X) → C^∞(X) satisfying the like of (3.82), i.e.,
\[ \delta(\{f,g\}) = \{\delta(f), g\} + \{f, \delta(g)\}, \tag{3.84} \]

a Poisson derivation. We are already familiar with a large class of Poisson
derivations: for each h ∈ C^∞(X), the corresponding map δ_h defined by (3.26) is a Poisson
derivation (this follows from the Jacobi identity). Let us call a Poisson derivation of
the kind δ_h inner. This raises the question if our derivations δ_A are inner.


Definition 3.14. A momentum map for a Lie group G acting on a Poisson manifold
X is a map

\[ J : X \to \mathfrak{g}^* \tag{3.85} \]

such that for each A ∈ g,
\[ \delta_A = \delta_{J_A}, \tag{3.86} \]

where the function J_A ∈ C^∞(X) is defined by
\[ J_A(x) = \langle J(x), A \rangle \equiv J(x)(A). \tag{3.87} \]
In other words, for each A ∈ g and f ∈ C^∞(X) we must have

\[ \delta_A(f) = \{J_A, f\}. \tag{3.88} \]
A Lie group action admitting a momentum map is called Hamiltonian.


Equivalently, a momentum map is a linear map
\[ J^* : \mathfrak{g} \to C^\infty(X) \tag{3.89} \]
such that δ_A = δ_{J^*(A)}; the connection between the two definitions is given by
\[ J_A = J^*(A). \tag{3.90} \]

The pullback notation J^* would suggest that it is a map C^∞(g^*) → C^∞(X), which is
not quite the case, but it is a near miss: we embed g ↪ C^∞(g^*) by A ↦ Â, where
Â(θ) = θ(A), so J^* : g → C^∞(X) is the restriction of the pullback J^* to g. Another
near miss would be to read J^* as the adjoint to J, which maps g^{**} = g to the 'dual'
X^*, but since X may not be a vector space, this dual cannot be defined as in linear
algebra, so instead of all linear maps from X to R we might as well say that it
consists of all smooth functions on X. Either way, the symbol J^* seems justified.


Proposition 3.15. Let G be a connected Lie group that acts on a Poisson manifold 
X. If this action is Hamiltonian (i.e., if it has a momentum map), then G acts on 
(X,B) by Poisson symmetries (in the sense that (3.81) holds). 


Proof. An easy computation shows that (3.82) holds. We omit the proof of the fact 
that for connected Lie groups this “infinitesimal” property is equivalent to (3.81); 
this relies on the fact that G is generated by the image of the exponential map. 


The converse is not true: if G acts by Poisson symmetries, the action is not
necessarily Hamiltonian. For example, take X = R², with the unusual Poisson bracket
\[ \{f, g\}(p,q) = p \left( \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p} \right), \tag{3.91} \]

and let G = R act on R² by b · (p,q) = (p, q+b). This action satisfies (3.81), and
has a single generator δ = −∂/∂q. But there clearly is no function J ∈ C^∞(R²) such
that {J, f} = −∂f/∂q (it should be J(p,q) = −log(p), which is singular at p = 0).
However, in most "everyday situations" momentum maps exist:


1. Take X = R⁶ = R³ × R³, with coordinates x = (p,q), where p = (p_1, p_2, p_3) and
q = (q^1, q^2, q^3), equipped with the canonical Poisson bracket (3.34).

a. Let G = R⁶ act on X by
\[ (a,b) \cdot (p,q) = (p+a,\ q+b). \tag{3.92} \]

This action is Hamiltonian, with momentum map

\[ J(p,q) = (q, -p). \tag{3.93} \]
b. Let G = SO(3) act on the same space X by
\[ R \cdot (p,q) = (Rp, Rq). \tag{3.94} \]
This action is also Hamiltonian, with momentum map

\[ J(p,q) = p \times q. \tag{3.95} \]

2. Let G = SO(3) act on X = R³, equipped with the Poisson bracket (3.43), through
its defining representation. This action has a momentum map

\[ J(x) = x, \tag{3.96} \]

where we have identified g with R³ by choosing the basis (3.66) of g, and have
identified g^* with g (and hence with R³ also) by the usual inner product on R³.

3. The previous example is a special case of the Lie-Poisson structure. Let G be a
Lie group with Lie algebra g. Choose a basis (T_a) of g, with associated structure
constants C^c_{ab} defined by the Lie bracket on g as

\[ [T_a, T_b] = \sum_c C^c_{ab}\, T_c. \tag{3.97} \]

We write θ in the dual vector space g^* as θ = Σ_a θ_a ω^a, where (ω^a) is the dual
basis to a chosen basis (T_a) of g, i.e., ω^a(T_b) = δ^a_b. In terms of these coordinates,
the Lie-Poisson bracket on C^∞(g^*) is defined by

\[ \{f, g\}(\theta) = \sum_{a,b,c} C^c_{ab}\, \theta_c\, \frac{\partial f(\theta)}{\partial \theta_a}\frac{\partial g(\theta)}{\partial \theta_b}. \tag{3.98} \]

Equivalently, the Poisson bracket (3.98) may be defined by the condition
\[ \{\hat{A}, \hat{B}\} = \widehat{[A,B]}, \tag{3.99} \]

where A,B ∈ g and Â ∈ C^∞(g^*) is the evaluation map Â(θ) = θ(A).
Now G canonically acts on g^* through the coadjoint representation, defined by

\[ (x \cdot \theta)(A) = \theta(x^{-1} A x). \tag{3.100} \]

This action is Hamiltonian with respect to the Lie-Poisson bracket (3.98), the
associated momentum map simply being the identity map g^* → g^*, as in (3.96).
In other words, we have

\[ J_A = \hat{A}, \tag{3.101} \]


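whose correctness may be verified from the computation

\[ \delta_A \hat{B}(\theta) = \frac{d}{dt}\, \hat{B}(e^{-tA}\cdot\theta)\Big|_{t=0} = \frac{d}{dt}\, \theta(e^{tA} B e^{-tA})\Big|_{t=0} = \theta([A,B]) = \{\hat{A}, \hat{B}\}(\theta). \]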


4. Let X = T^*Q for some manifold Q, e.g. Q = R^n and hence X = R^{2n}. We take
\[ G = \mathrm{Diff}(Q), \tag{3.102} \]

i.e., the diffeomorphism group of Q. This is an infinite-dimensional Lie group (if
described in the right way). The defining action of φ ∈ G on Q induces an action
called φ^* on T^*Q, given (in coordinates) by

\[ \varphi^*(p,q) = (p', q'); \tag{3.103} \]
\[ (q')^j = \varphi^j(q); \tag{3.104} \]
\[ p'_j = \sum_i \frac{\partial (\varphi^{-1})^i}{\partial q^j}(\varphi(q))\, p_i. \tag{3.105} \]

This may be taken as a definition, but in the language of differential geometry
this comes down to the neater prescription that if θ = Σ_j p_j dq^j ∈ T^*_qQ, then
φ^*θ ∈ T^*_{φ(q)}Q is the one-form that maps a vector X ∈ T_{φ(q)}Q to θ(φ_*^{-1}(X)), i.e.,

\[ (\varphi^*\theta)(X) = \theta(\varphi_*^{-1}(X)), \tag{3.106} \]
where φ_*^{-1}(X) = Σ_j (φ_*^{-1}X)^j ∂/∂q^j is given componentwise by, cf. (3.52),
\[ (\varphi_*^{-1}X)^j = \sum_i \frac{\partial (\varphi^{-1})^j}{\partial q^i}(\varphi(q))\, X^i. \tag{3.107} \]

If Q = R³ and φ = R ∈ SO(3), then, using R^{-1} = R^T, we find that (3.104) -
(3.105) simply become R^*(p,q) = (Rp, Rq), as in (3.94).
Furthermore, if φ(q) = q + b, then the partial derivatives in (3.105) form the
identity matrix, so that φ^*(p,q) = (p, q+b). To show that the action of Diff(Q)
on T^*Q is Hamiltonian and compute its momentum map, we need to know that
the Lie algebra of Diff(Q) is the space Vec(Q) of all vector fields on Q, with
its canonical Lie bracket (3.61). We will not prove this, but the exponential map
exp : g → G is given through the flow φ of the vector field ξ on Q by (cf. (3.20))

\[ e^{t\xi} = \varphi_t. \tag{3.108} \]
Theorem 3.16. The action of Diff(Q) on T^*Q has momentum map

\[ J_\xi(p,q) = -\sum_j p_j\, \xi^j(q), \tag{3.109} \]

and hence is Hamiltonian. Moreover, this momentum map satisfies
\[ \{J_\xi, J_\eta\} = J_{[\xi,\eta]}. \tag{3.110} \]

Proof. First note that φ_t^{-1} = φ_{-t}, so from (3.71), (3.108), and (3.104) - (3.105),

\[ \delta_\xi f(p,q) = \frac{d}{dt} f(\varphi_{-t}^*(p,q))\Big|_{t=0}. \]

From this and (3.109), using the canonical Poisson bracket (3.34) we find

\[ \{J_\xi, f\} = \delta_\xi f. \]

Finally, verifying (3.110) is a simple exercise.


Thus the momentum map is a generalization of (minus) the momentum, whence 
its name; the quantity in (3.95) is (minus) the angular momentum. These annoying 
minus signs could be removed by putting a minus sign in (3.86), but that would have 
other negative (sic) consequences. For example, with our sign choice one often has 


\[ \{J_A, J_B\} = J_{[A,B]}, \tag{3.111} \]


in which case the accompanying map (3.89) is a homomorphism of Lie algebras, 
or, equivalently, J is a morphism with respect to the given Poisson bracket on X 
and the Lie—Poisson bracket on g*. Such a momentum map is called infinitesimally 
equivariant, for if G is connected, (3.111) is equivalent to the equivariance property 


\[ J(\gamma \cdot x) = \gamma \cdot J(x). \tag{3.112} \]
Here the G-action on g^* on the right-hand side is the coadjoint representation.

All of this is true for our examples (3.95), (3.96), (3.101), and (3.109); in the
latter case we note that the Lie bracket in the Lie algebra of Diff(Q) is minus the
commutator of vector fields. However, (3.111) does not always hold (in which case
a fortiori also (3.112) fails). For example, it fails for (3.93): if we take the usual
basis (e,f) = (e_1, e_2, e_3, f_1, f_2, f_3) of g = R⁶ and relabel e_i = Q_i and f_i = −P_i, then

\[ J_{P_i}(p,q) = p_i; \tag{3.113} \]
\[ J_{Q_j}(p,q) = q^j; \tag{3.114} \]

cf. (3.93), and hence, although [P_i, P_j] = [Q_i, Q_j] = [P_i, Q_j] = 0, we obtain

\[ \{J_{P_i}, J_{P_j}\} = \{J_{Q_i}, J_{Q_j}\} = 0; \tag{3.115} \]
\[ \{J_{P_i}, J_{Q_j}\} = \delta_{ij}\, 1_{\mathbb{R}^6}. \tag{3.116} \]

Fortunately, in cases like that one can often find a central extension Gg of G (see
§5.10 below for notation) that acts on X through its quotient group G and does have
an infinitesimally equivariant momentum map. In the case at hand, the Heisenberg
group H_3 does the job, whose central elements (0,0,c) then act trivially on R⁶. In
terms of the generators (3.68) we take J_{P_i} and J_{Q_j} as in (3.113) - (3.114), and add
J_Z = 1_{R⁶}; according to (3.69) and (3.115) - (3.116) we then have (3.111), as desired.
Finally, the above formalism leads to a clean formulation of Noether’s Theorem, 
providing the well-known link between symmetries and conserved quantities: 


Theorem 3.17. Let X be a Poisson manifold equipped with a Hamiltonian action of
some Lie group G (so that there is a momentum map J : X → g^*). Suppose h ∈ C^∞(X)
is G-invariant, in that h(γ · x) = h(x) for each γ ∈ G and x ∈ X. Then for each A ∈ g,
the function J_A is constant along the flow of the vector field ξ_h. In other words,

\[ J_A(\varphi_t(x)) = J_A(x) \tag{3.117} \]
for any x ∈ X and any t ∈ R for which the flow φ_t(x) of ξ_h is defined.


Proof. Using all assumptions as well as the definition of a flow, we compute:

\[ \begin{aligned}
\frac{d}{dt} J_A(\varphi_t(x)) &= \xi_h(J_A)(\varphi_t(x)) = \delta_h(J_A)(\varphi_t(x)) \\
&= \{h, J_A\}(\varphi_t(x)) = -\{J_A, h\}(\varphi_t(x)) \\
&= -\delta_A(h)(\varphi_t(x)) = -\frac{d}{ds} h(e^{-sA}\varphi_t(x))\Big|_{s=0} \\
&= -\frac{d}{ds} h(\varphi_t(x))\Big|_{s=0} = 0.
\end{aligned} \]


For example, a Hamiltonian (3.38) has conserved (angular) momentum if the poten- 
tial V is translation (rotation) invariant, reflecting (3.93) and (3.95), respectively. 
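The rotational case of Noether's Theorem can be observed numerically. The following sketch makes several assumptions that are not in the text: m = 1, the rotation-invariant potential V(q) = |q|²/2 + |q|⁴/4 is a hypothetical example, and the flow is approximated by leapfrog steps. For central forces each leapfrog sub-step changes p × q only by rounding errors, so the quantity (3.95) stays constant to machine precision along the discrete trajectory.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def force(q: Vec3) -> Vec3:
    """F = -grad V for the assumed potential V(q) = |q|^2/2 + |q|^4/4."""
    r2 = q[0] * q[0] + q[1] * q[1] + q[2] * q[2]
    return tuple(-(1.0 + r2) * qi for qi in q)


def leapfrog(p: Vec3, q: Vec3, dt: float, steps: int) -> Tuple[Vec3, Vec3]:
    """Approximate Hamiltonian flow of h = |p|^2/2 + V(q) (mass m = 1)."""
    for _ in range(steps):
        f = force(q)
        p = tuple(pi + 0.5 * dt * fi for pi, fi in zip(p, f))
        q = tuple(qi + dt * pi for qi, pi in zip(q, p))
        f = force(q)
        p = tuple(pi + 0.5 * dt * fi for pi, fi in zip(p, f))
    return p, q


def momentum_map(p: Vec3, q: Vec3) -> Vec3:
    """J(p, q) = p x q, i.e. minus the angular momentum, cf. (3.95)."""
    return cross(p, q)


if __name__ == "__main__":
    p0, q0 = (0.3, -0.2, 0.5), (1.0, 0.4, -0.7)
    p1, q1 = leapfrog(p0, q0, dt=1e-3, steps=5000)
    drift = max(abs(a - b) for a, b in zip(momentum_map(p0, q0), momentum_map(p1, q1)))
    print("drift of J along the flow:", drift)
```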
Notes


The traditional symplectic approach to classical mechanics, culminating in the mo- 
mentum map, is exhaustively covered in Guillemin & Sternberg (1984) and Abra- 
ham & Marsden (1985). A founding paper for Poisson geometry is Weinstein 
(1983). The modern Poisson approach to mechanics may be found in Marsden & 
Ratiu (1994), from which most of the material in this chapter originates. 

Our proof of Proposition 3.11 is based on Navarro Gonzalez & Sancho de Salas 
(2003), §2.1. Burtscher (2009) is a nice survey of many similar results. 


Chapter 4 
Quantum physics on a general Hilbert space 


In this chapter we generalize the results of Chapter 2 to infinite-dimensional Hilbert
spaces. So let H be a Hilbert space and let B(H) be the set of all bounded operators
on H. Here a notable point is that linear operators on finite-dimensional
Hilbert spaces are automatically bounded, whereas in general they are not. Thus we
impose boundedness as an extra requirement, beyond linearity. This is very convenient,
because as in the finite-dimensional case, B(H) is a C*-algebra, cf. §C.1.
At the same time, assuming boundedness involves no loss of generality whatsoever,
since we can always replace closed unbounded operators by bounded ones through
the bounded transform, as explained in §B.21. Nonetheless, even the relatively easy
setting of bounded operators leads to some technical complications we have to deal
with. First, Definition 2.1 must be adjusted as follows:


Definition 4.1. Let H be a Hilbert space.


1. A (quantum) event is a closed linear subspace L of H.

2. A density operator is a positive trace-class operator ρ on H such that Tr(ρ) = 1;
we continue to denote the set of all density operators on H by D(H).

3. A (quantum) random variable is a bounded self-adjoint operator on H.

4. The spectrum σ(a) of a bounded operator a is the set of all λ ∈ C for which the
operator a − λ is not invertible in B(H) (cf. Definition B.80).


As shown in Corollary B.88, if H is finite-dimensional this notion of a spectrum
reduces to the set of eigenvalues of a. Even if H is infinite-dimensional, the spectrum
of a self-adjoint operator a is real (i.e., σ(a) ⊂ R); this is also true if a is unbounded
(see Theorem B.93). For any H, unit vectors ψ still define special density matrices
e_ψ, as in (2.7); we will later see that these are pure states on B(H), although the
set of pure states is no longer exhausted by such density matrices. Finally, quantum
events in H still bijectively correspond with projections on H; see Proposition B.76.
The Born rule as well as the correspondence between density matrices and states
require a separate discussion, to which we now turn.
4.1 The Born rule from Bohrification (II)


In this section we extend the characterization of the Born rule in §2.5, which was
restricted to finite phase spaces X and finite-dimensional Hilbert spaces H, to the
general case. Recall that a probability space is a measure space (X,Σ,μ) for which
μ(X) = 1, and that, for compact X, a state on C(X) is a linear map φ : C(X) → C
that is positive and satisfies φ(1_X) = 1. Theorem B.15 and Corollary B.17 yield:


Theorem 4.2. Let X be a compact Hausdorff space. There is a bijective correspondence
between probability measures μ on X and states φ on C(X), given by

\[ \varphi(f) = \int_X d\mu\, f, \quad f \in C(X). \tag{4.1} \]

More precisely, the correspondence in question is between complete regular
probability spaces (X,Σ,μ) and states on C(X), and this is understood in what follows.
Second, we recall that if H is a Hilbert space and a ∈ B(H), then C*(a) is the
C*-algebra generated by a and 1_H (i.e., the norm-closure of the algebra of all
polynomials in a). Theorems B.84, B.94, and B.93 give the following spectral theorem:


Theorem 4.3. If a* = a ∈ B(H), then C*(a) is commutative, σ(a) ⊂ R is compact,
and there is an isomorphism of (commutative) C*-algebras

\[ C(\sigma(a)) \cong C^*(a), \tag{4.2} \]

written f ↦ f(a), which is unique if it is subject to the following conditions:


1. the unit function 1_{σ(a)} : λ ↦ 1 corresponds to the unit operator 1_H;
2. the identity function id_{σ(a)} : λ ↦ λ is mapped to the given operator a.


Furthermore, this continuous functional calculus satisfies the rules

\[ (tf + g)(a) = t f(a) + g(a); \tag{4.3} \]
\[ (fg)(a) = f(a)\, g(a); \tag{4.4} \]
\[ f(a)^* = f^*(a). \tag{4.5} \]


Combining Theorems 4.2 and 4.3 gives a result of great importance: 


Corollary 4.4. Let H be a Hilbert space, let a* = a ∈ B(H), and let ψ ∈ H be a unit
vector. There exists a unique probability measure μ_ψ on the spectrum σ(a) such that

\[ \langle \psi, f(a)\psi \rangle = \int_{\sigma(a)} d\mu_\psi\, f, \quad f \in C(\sigma(a)). \tag{4.6} \]

In terms of the spectral projections e_Δ = 1_Δ(a) (defined for Borel sets Δ ⊂ σ(a))
constructed in (B.305) - (B.307) and Theorem B.102, the Born measure is given by

\[ \mu_\psi(\Delta) = \|e_\Delta \psi\|^2. \tag{4.7} \]
More generally, a density operator ρ ∈ D(H) induces a unique probability measure
μ_ρ on σ(a) for which

\[ \mathrm{Tr}(\rho f(a)) = \int_{\sigma(a)} d\mu_\rho\, f, \quad f \in C(\sigma(a)); \tag{4.8} \]
\[ \mu_\rho(\Delta) = \mathrm{Tr}(\rho\, e_\Delta). \tag{4.9} \]
This measure on σ(a) is called the Born measure (defined by a and ψ or ρ).


Proof. The point is that the map f ↦ ⟨ψ, f(a)ψ⟩ defines a state on C(σ(a)):


• Linearity follows from linearity of the continuous functional calculus f ↦ f(a);
• Positivity follows because if f ≥ 0, then f = √f · √f, so that by (4.4) and (4.5),

\[ \langle \psi, f(a)\psi \rangle = \|\sqrt{f}(a)\psi\|^2 \geq 0; \]
• Unitality follows from Theorem 4.3.1, i.e., ⟨ψ, 1_{σ(a)}(a)ψ⟩ = ⟨ψ,ψ⟩ = 1.


To prove (4.7), use Lemma B.97 to approximate 1_Δ by functions f_n ∈ C(σ(a)) as
stated. By Theorem B.13.2 (i.e., the Lebesgue Monotone Convergence Theorem),
we have ∫_{σ(a)} dμ_ψ f_n → ∫_{σ(a)} dμ_ψ 1_Δ = μ_ψ(Δ), whereas by (B.315) with a_n = f_n(a),
one has ⟨ψ, f_n(a)ψ⟩ → ⟨ψ, e_Δ ψ⟩ = ‖e_Δ ψ‖². Hence (4.7) follows from (4.6).
The proof for density operators is analogous.


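Defining the mean value ⟨a⟩_ψ of a with respect to the Born measure μ_ψ by
\[ \langle a \rangle_\psi = \int_{\sigma(a)} d\mu_\psi(x)\, x, \tag{4.10} \]

and similarly for ρ, using Theorem 4.3.2 we easily obtain

\[ \langle a \rangle_\psi = \langle \psi, a\psi \rangle; \tag{4.11} \]
\[ \langle a \rangle_\rho = \mathrm{Tr}(\rho a). \tag{4.12} \]

As an important special case, suppose that σ(a) = σ_p(a) (i.e., each λ ∈ σ(a) is
an eigenvalue); this always happens if H is finite-dimensional. Eq. (A.57) then gives

\[ \langle \psi, f(a)\psi \rangle = \sum_{\lambda \in \sigma(a)} f(\lambda)\, \|e_\lambda \psi\|^2, \]

where e_λ is the projection onto the eigenspace H_λ = {ψ ∈ H | aψ = λψ}. Thus
\[ \mu_\psi(\{\lambda\}) = \|e_\lambda \psi\|^2, \tag{4.13} \]

and using the notation P_ψ(a = λ) for μ_ψ({λ}), eq. (4.11) just becomes

\[ \langle a \rangle_\psi = \sum_{\lambda \in \sigma(a)} \lambda \cdot P_\psi(a = \lambda). \tag{4.14} \]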
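In finite dimension the Born measure is a short computation. The sketch below is an illustration under stated assumptions: H = C², the observable is the Pauli matrix σ_x (a hypothetical choice, with eigenvalues ±1 and explicitly known eigenprojections), and the unit vector ψ is an arbitrary sample. It evaluates μ_ψ({λ}) = ‖e_λ ψ‖² and compares the mean of this measure with ⟨ψ, aψ⟩.

```python
import cmath
import math
from typing import Dict, List, Sequence

Matrix = List[List[complex]]

SIGMA_X: Matrix = [[0, 1], [1, 0]]

# Spectral projections of sigma_x onto its eigenspaces for lambda = +1 and -1.
SPECTRAL_PROJECTIONS: Dict[float, Matrix] = {
    +1.0: [[0.5, 0.5], [0.5, 0.5]],
    -1.0: [[0.5, -0.5], [-0.5, 0.5]],
}


def apply(m: Matrix, v: Sequence[complex]) -> List[complex]:
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(u: Sequence[complex], v: Sequence[complex]) -> complex:
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))


def born_measure(psi: Sequence[complex]) -> Dict[float, float]:
    """mu_psi({lambda}) = ||e_lambda psi||^2 for each eigenvalue lambda."""
    measure = {}
    for lam, projection in SPECTRAL_PROJECTIONS.items():
        component = apply(projection, psi)
        measure[lam] = inner(component, component).real
    return measure


def mean(measure: Dict[float, float]) -> float:
    """Mean value as the sum over eigenvalues of lambda * P(a = lambda)."""
    return sum(lam * weight for lam, weight in measure.items())


if __name__ == "__main__":
    theta, chi = 0.3, 0.7
    psi = [math.cos(theta), cmath.exp(1j * chi) * math.sin(theta)]
    mu = born_measure(psi)
    print(mu, sum(mu.values()))
    print(mean(mu), inner(psi, apply(SIGMA_X, psi)).real)
```

For this ψ both expressions for the mean equal sin(2θ) cos(χ).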


It is customary to extend the Born measure on σ(a) ⊂ R to a (probability) measure
μ′_ψ on all of R by simply stipulating that
\[ \mu'_\psi(\Delta) = \mu_\psi(\Delta \cap \sigma(a)); \tag{4.15} \]

we will often assume this and omit the prime. This obviously implies that μ_ψ(Δ) = 0
for any Borel set Δ ⊂ R disjoint from σ(a); in particular, if σ(a) is discrete, then
μ_ψ is concentrated on the eigenvalues λ of a, in that

\[ \mu_\psi(\Delta) = \sum_{\lambda \in \Delta \cap \sigma(a)} \mu_\psi(\{\lambda\}). \tag{4.16} \]

To state an interesting property of the Born measure we need Hausdorff’s solu- 
tion to the relevant special case of the famous Hamburger Moment Problem: 


Theorem 4.5. If $K \subset \mathbb{R}$ is compact, then any finite measure $\mu$ on $K$ is determined
by its moments

$$\alpha_n = \int_K d\mu(x)\, x^n. \qquad (4.17)$$

Using $f(x) = x^n$ in (4.6), we therefore obtain:


Corollary 4.6. The Born measure $\mu_\psi$ is determined by its moments

$$\alpha_n = \langle\psi, a^n\psi\rangle. \qquad (4.18)$$


More precisely, we need to be sure that numbers $(\alpha_n)$ of the kind (4.18) are the
moments of some (probability) measure. This follows from the spectral theorem by
running the above argument backwards, but one may also use the general solution
of the Hamburger Moment Problem, which we here state without proof:


Theorem 4.7. A sequence of real numbers $(\alpha_n)$ forms the moments of some measure
$\mu$ on $\mathbb{R}$ iff for all $N \in \mathbb{N}$ and $(\beta_0,\ldots,\beta_N) \in \mathbb{C}^{N+1}$ one has $\sum_{n,m=0}^{N} \overline{\beta}_n\beta_m\alpha_{n+m} \geq 0$.
Furthermore, if there are constants $C$ and $D$ such that $|\alpha_n| \leq C D^n n!$, then $\mu$ is
uniquely determined by its moments $(\alpha_n)$.


These conditions are easily checked from (4.18). 
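
For a bounded self-adjoint $a$ and a unit vector $\psi$, the check might be spelled out as follows. This is only a sketch under that boundedness assumption; the polynomial $q(x) = \sum_n \beta_n x^n$ is auxiliary notation that does not occur in the text.

```latex
% q(x) = \sum_n \beta_n x^n is an arbitrary complex polynomial (auxiliary notation).
% Self-adjointness of a gives <psi, a^{n+m} psi> = <a^n psi, a^m psi>.
\begin{align*}
\sum_{n,m} \overline{\beta}_n \beta_m \alpha_{n+m}
  &= \sum_{n,m} \overline{\beta}_n \beta_m \langle a^n\psi, a^m\psi\rangle
   = \| q(a)\psi \|^2 \geq 0, \\
|\alpha_n| &= |\langle\psi, a^n\psi\rangle| \leq \|a\|^n \leq \|a\|^n \, n! ,
\end{align*}
% so the growth bound holds with C = 1 and D = \|a\|.
```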


If $a$ is unbounded, but still assumed to be self-adjoint (in the sense appropriate
for unbounded operators, cf. Definition B.70), the spectrum $\sigma(a)$ remains real (see
Theorem B.93) but it is no longer compact. Nonetheless, the Born measure on $\sigma(a)$
may be constructed in almost exactly the same way as in the bounded case, this time
invoking Corollary B.21 and Theorem B.158 instead of Theorems 4.2 and B.94,
respectively. Corollary 4.4 then holds almost verbatim for the unbounded case:


Corollary 4.8. Let $H$ be a Hilbert space, let $a^* = a$, and let $\psi \in H$ be a unit vector.
There exists a unique probability measure $\mu_\psi$ on the spectrum $\sigma(a)$ such that

$$\langle\psi, f(a)\psi\rangle = \int_{\sigma(a)} d\mu_\psi\, f, \qquad f \in C_0(\sigma(a)). \qquad (4.19)$$

Also, eqs. (4.7) and (4.9) hold, as does (4.8), with $f \in C_0(\sigma(a))$.

There is no need to worry about domains, since even if $a$ is unbounded, $f(a)$ is
bounded for $f \in C_b(\sigma(a))$, and hence also for $f \in C_0(\sigma(a))$.
The physical relevance of the Born measure is given by the Born rule:


If an observable $a$ is measured in a state $\rho$, then the probability $P_\rho(a \in \Delta)$ that the
outcome lies in $\Delta \subset \mathbb{R}$ is given by the Born measure $\mu_\rho$ defined by $a$ and $\rho$, i.e.,

$$P_\rho(a \in \Delta) = \mu_\rho(\Delta). \qquad (4.20)$$


As in the finite-dimensional case, the Born measure may be generalized to families
$(a_1,\ldots,a_n)$ of commuting self-adjoint operators. Assuming these are bounded,
the C*-algebra $C^*(a_1,\ldots,a_n)$ is defined in the obvious way, i.e., as the smallest
C*-algebra containing each $a_i$, or, equivalently, as the norm-closure of the algebra of all
finite polynomials in the $(a_1,\ldots,a_n)$. This C*-algebra is commutative, as a simple
approximation argument shows: polynomials in the $a_i$ obviously commute, and this
property extends to the closure by continuity of multiplication. However, even in the
bounded case, the correct notion of a joint spectrum is not obvious. In order to motivate
the following definition, it helps to recall Definition 1.4, Theorem C.24, and
especially the last sentence before the proof of the latter, making the point that the
spectrum $\sigma(a)$ of a single (bounded) self-adjoint operator coincides with the image
of the Gelfand spectrum $\Sigma(C^*(a))$ in $\mathbb{C}$ under the map $\omega \mapsto \omega(a)$.


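Definition 4.9. The joint spectrum $\sigma(a) = \sigma(a_1,\ldots,a_n) \subset \mathbb{R}^n$ of a finite family
$a = (a_1,\ldots,a_n)$ of commuting bounded self-adjoint operators is the image of the
Gelfand spectrum $\Sigma(C^*(a_1,\ldots,a_n)) \equiv \Sigma(C^*(a))$ under the map

$$\Sigma(C^*(a_1,\ldots,a_n)) \to \mathbb{R}^n, \qquad \omega \mapsto (\omega(a_1),\ldots,\omega(a_n)). \qquad (4.21)$$

Since $\omega(a_i)$ only utilizes the restriction of $\omega$ to $C^*(a_i) \subset C^*(a)$, we have $\omega(a_i) \in
\sigma(a_i) \subset \mathbb{R}$, so that $\Sigma(C^*(a)) \subset \sigma(a_1) \times \cdots \times \sigma(a_n)$ is a compact subset of $\mathbb{R}^n$.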
To justify this definition, we note that: 


• For $n = 1$, this definition reproduces the usual spectrum, cf. Theorem C.24.

• For $n > 1$ and $\dim(H) < \infty$, we recover the joint spectrum of Definition A.16.

• For $n > 1$ and $\dim(H) = \infty$, Weyl's Theorem B.91 generalizes in the obvious
way: we have $\lambda \in \sigma(a)$ iff there exists a sequence $(\psi_k)$ of unit vectors in $H$ with

$$\lim_{k\to\infty} \|(a_i - \lambda_i)\psi_k\| = 0, \qquad (4.22)$$

for each $i = 1,\ldots,n$. The proof is similar.


One way to see the second claim is to use Proposition C.14 joined with the observation
that, as in the case of $A = B(H)$ for finite-dimensional $H$, any pure state
on a finite-dimensional C*-algebra $A \subset B(H)$ is a vector state (2.42), too. To see
this, we first specialize Theorem C.133 to the finite-dimensional case (where the
proof becomes elementary), so that each state on $C^*(a)$ takes the form (2.33). Subsequently,
we use the spectral decomposition (2.6), and use the definition of purity:
suppose $\omega(b) = \mathrm{Tr}(\rho b) = \sum_i p_i\langle v_i, b v_i\rangle = \sum_i p_i\,\omega_{v_i}(b)$ is pure, where $b \in C^*(a)$.
Then $\omega_{v_i} = \omega$ for each $i$ with $p_i > 0$, so that $\omega$ is a vector state, say $\omega(b) = \langle\psi, b\psi\rangle$ where $\psi$ is
one of the $v_i$. Once we know this, suppose $\lambda = (\lambda_1,\ldots,\lambda_n) \in \sigma(a)$, with $\lambda_i = \omega(a_i)$.
Multiplicativity of $\omega$ implies that for any finite polynomial $p$ in $n$ real variables we
have $\langle\psi, p(a)\psi\rangle = p(\lambda)$, which easily gives $a_i\psi = \lambda_i\psi$ for each $i$; for example,
take $p(x) = (x_i - \lambda_i)^2$, so that the previous equation gives $\|(a_i - \lambda_i)\psi\|^2 = 0$.

Conversely, if $\lambda$ is a joint eigenvalue of $a$, then by definition there exists a joint
eigenvector $\psi$ whose vector state $\omega(b) = \langle\psi, b\psi\rangle$ on $C^*(a)$ is multiplicative.

Using this (perhaps contrived) notion of a joint spectrum, Theorem 2.19 now
holds by construction also if $\dim(H) = \infty$, where the pertinent isomorphism $f \mapsto
f(a)$ is given as in the single operator case, that is, by starting with polynomials and
using a continuity argument to pass to arbitrary continuous functions.

Theorem 2.18 and Corollary 4.4 then generalize to: 


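Theorem 4.10. Let $H$ be a Hilbert space, let $a = (a_1,\ldots,a_n)$ be a finite family of
commuting bounded self-adjoint operators, and let $\psi \in H$ be a unit vector. There
exists a unique probability measure $\mu_\psi$ on the joint spectrum $\sigma(a)$ such that

$$\langle\psi, f(a)\psi\rangle = \int_{\sigma(a)} d\mu_\psi\, f, \qquad f \in C(\sigma(a)), \qquad (4.23)$$

or, equivalently, for special Borel sets $\Delta = \Delta_1 \times \cdots \times \Delta_n \subset \sigma(a)$, where $\Delta_i \subset \sigma(a_i)$,

$$\mu_\psi(\Delta) = \|e_{\Delta_1} \cdots e_{\Delta_n}\psi\|^2, \qquad (4.24)$$

where the $e_{\Delta_i} = 1_{\Delta_i}(a_i)$ are the pertinent spectral projections (which commute).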


Similarly for density operators instead of pure states. 
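
As a finite-dimensional illustration of the joint spectrum and of (4.24), the sketch below takes two commuting observables that are diagonal in the same basis. The matrices, the state, and the helper names `joint_born_measure` and `probability_of_box` are assumptions chosen for the example, not constructions from the text.

```python
from collections import defaultdict

# Two commuting observables on C^4, both diagonal in the standard basis
# (an illustrative choice): a1 = diag(0, 0, 1, 1), a2 = diag(-1, 1, -1, 1).
a1_diag = [0, 0, 1, 1]
a2_diag = [-1, 1, -1, 1]

# Unit vector psi with |psi_k|^2 = 0.1, 0.2, 0.3, 0.4.
psi = [x ** 0.5 for x in (0.1, 0.2, 0.3, 0.4)]

def joint_born_measure(diagonals, psi):
    """Map each joint eigenvalue (lambda_1, ..., lambda_n) to its Born probability."""
    measure = defaultdict(float)
    for k, amplitude in enumerate(psi):
        joint_eigenvalue = tuple(d[k] for d in diagonals)
        measure[joint_eigenvalue] += abs(amplitude) ** 2
    return dict(measure)

def probability_of_box(measure, boxes):
    """mu_psi(Delta_1 x ... x Delta_n) for finite sets Delta_i, as in (4.24)."""
    return sum(p for lam, p in measure.items()
               if all(l in box for l, box in zip(lam, boxes)))

mu = joint_born_measure([a1_diag, a2_diag], psi)
# All components of psi are nonzero here, so the keys of mu exhaust the joint spectrum.
joint_spectrum = sorted(mu)
p_box = probability_of_box(mu, [{1}, {-1, 1}])  # probability that a1 takes the value 1
```

Here the joint spectrum consists of the four pairs (0, -1), (0, 1), (1, -1), (1, 1), a subset of the product of the individual spectra, and `p_box` adds the weights 0.3 and 0.4.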

If (some of) the operators $a_i$ are unbounded, we use the trick of §B.21 and pass
to their bounded transforms $b_i$, see Theorem B.152. We say that the $a_i$ commute iff
the corresponding bounded operators $b_i$ do; this is equivalent to commutativity of
all spectral projections of the $a_i$. We then define, in self-explanatory notation,

$$\sigma(a) = \{\lambda(1-\lambda^2)^{-1/2} \mid \lambda \in \sigma(b) \cap (-1,1)^n\}. \qquad (4.25)$$

This leads to Born measures on $\sigma(a)$ defined either as in (4.23), with $f \in C(\sigma(a))$
replaced by $f \in C_0(\sigma(a))$, cf. (4.19), or as in (4.24).
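
At the level of spectral values, the passage to the bounded transform and back in (4.25) is just a pair of mutually inverse scalar maps between the real line and the open interval (-1, 1). The following sketch checks this numerically; the function names are hypothetical and the sample points are arbitrary.

```python
import math

def bounded_transform(x):
    """Scalar version of a -> b = a (1 + a^2)^(-1/2); maps R into (-1, 1)."""
    return x / math.sqrt(1.0 + x * x)

def inverse_transform(lam):
    """Scalar version of lambda -> lambda (1 - lambda^2)^(-1/2), as in (4.25)."""
    if not -1.0 < lam < 1.0:
        raise ValueError("only points of (-1, 1) correspond to finite spectral values")
    return lam / math.sqrt(1.0 - lam * lam)

samples = [-50.0, -1.0, 0.0, 0.25, 3.0, 1000.0]
round_trip = [inverse_transform(bounded_transform(x)) for x in samples]
```

The endpoints -1 and +1 are excluded in (4.25) precisely because they would correspond to infinite spectral values; the sketch raises an error for them.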
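For example, if $H = L^2(\mathbb{R}^n)$ and $a_i\psi(x) = x_i\psi(x)$, defined on the domain

$$D(a_i) = \Big\{\psi \in L^2(\mathbb{R}^n) \;\Big|\; \int_{\mathbb{R}^n} d^nx\, x_i^2\, |\psi(x)|^2 < \infty\Big\}, \qquad (4.26)$$

as in (B.242), then $b_i\psi(x) = x_i(1+x_i^2)^{-1/2}\psi(x)$, so that $\sigma(b) = [-1,1]^n$ and hence
$\sigma(a) = \mathbb{R}^n$. For a measurable region $\Delta \subset \mathbb{R}^n$ we then have Pauli's famous formula

$$\mu_\psi(\Delta) = \int_\Delta d^nx\, |\psi(x)|^2 \qquad (4.27)$$

for the probability of finding the particle in the region $\Delta$, given that the system is in a pure state $\psi$.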
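
A one-dimensional numerical illustration of (4.27) may help. The Gaussian wave function, the interval [-1, 1], and the midpoint rule below are all assumptions made for the sake of the example.

```python
import math

def psi(x):
    """Normalized Gaussian wave function on R (an illustrative choice of state)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def probability(lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of |psi|^2 over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(abs(psi(lower + (i + 0.5) * h)) ** 2 for i in range(steps))

p_numeric = probability(-1.0, 1.0)
p_exact = math.erf(1.0)  # closed form for this particular Gaussian
```

For this state $|\psi(x)|^2 = \pi^{-1/2} e^{-x^2}$, so the probability of finding the particle in [-1, 1] is erf(1), which the numerical integral reproduces.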
4.2 Density operators and normal states


Definition 2.4 of a state still makes good sense in the infinite-dimensional case, as
it simply specializes the general definition of a state on a C*-algebra $A$ to the case
$A = B(H)$. Thus we continue to say that a state on $B(H)$ is a complex-linear map
$\omega : B(H) \to \mathbb{C}$ satisfying $\omega(b^*b) \geq 0$ for each $b \in B(H)$ and $\omega(1_H) = 1$. Despite
this lack of novelty in the definition of a state (i.e., compared to finite-dimensional
Hilbert spaces), Theorem 2.7 no longer holds if $H$ is infinite-dimensional: although
it (almost trivially) remains true that density operators $\rho$ on $H$ define states on $B(H)$
through the fundamental correspondence $\omega(a) = \mathrm{Tr}(\rho a)$, $a \in B(H)$, cf. (2.33), there
are (many) states that are not given in that way (see below). Fortunately, states that
do arise through (2.33) can be characterized in a simple way.


Definition 4.11. A state $\omega : B(H) \to \mathbb{C}$ is called normal if for each orthogonal
family $(e_i)$ of projections (i.e., $e_i^* = e_i$ and $e_ie_j = \delta_{ij}e_i$) one has

$$\omega\Big(\sum_i e_i\Big) = \sum_i \omega(e_i). \qquad (4.28)$$

Here $\sum_i e_i$ is defined as the projection on the smallest closed subspace $K$ of $H$ that
contains each $e_iH$ (that is, $\sum_i e_i = \bigvee_i e_i$, i.e. the supremum in the poset $\mathcal{P}(H)$ of all
projections on $H$ with respect to the partial order $e \leq f$ iff $eH \subseteq fH$). Furthermore,
the sum over $i$ on the right-hand side is defined by (B.11), i.e., as the supremum (in
$\mathbb{R}$) of the set of all sums $\sum_{i\in F}\omega(e_i)$ over finite subsets $F \subset I$ of the index set $I$ in
which $i$ takes values. It is finite because $\sum_{i\in F} e_i \leq 1_H$ and hence, since $\omega$ is positive,

$$\sum_{i\in F} \omega(e_i) \leq \omega(1_H) = 1.$$
For example, let $(v_i)$ be a basis of $H$ with associated one-dimensional projections

$$e_i = |v_i\rangle\langle v_i|. \qquad (4.29)$$

If $\omega$ is assumed to be a state, then the additivity condition (4.28) implies

$$\sum_i \omega(e_i) = 1, \qquad (4.30)$$

or, equivalently, using Definition B.6 etc. as well as the notation $e_F = \sum_{i\in F} e_i$,

$$\lim_F \omega(e_F) = 1. \qquad (4.31)$$

If $H$ is separable, any orthogonal family $(e_i)$ of projections is necessarily countable,
and (4.28) is analogous to the countable additivity condition defining a measure.
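
For a state given by a density operator, conditions (4.30) and (4.31) can be watched numerically. The sketch below assumes a density operator that is diagonal in the chosen basis with eigenvalues $2^{-i}$, $i = 1, 2, \ldots$; both this choice and the function names are illustrative.

```python
# Eigenvalues p_i = 2^{-i}, i = 1, 2, ..., of a density operator rho that is
# diagonal in the chosen basis (v_i); this rho is an illustrative assumption.
def p(i):
    return 2.0 ** -i

def omega_of_e_F(F):
    """omega(e_F) = Tr(rho e_F) = sum of p_i over the finite index set F."""
    return sum(p(i) for i in F)

# Along the increasing finite sets F_N = {1, ..., N} the values omega(e_F)
# increase towards omega(1_H) = 1, as in (4.31).
partial = [omega_of_e_F(range(1, N + 1)) for N in (1, 2, 5, 10, 50)]
```

The values increase monotonically and approach 1, which is the behaviour that (4.31) demands of a normal state; a singular state would instead give 0 for every finite $F$.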


Theorem 4.12. A state $\omega$ on $B(H)$ takes the form $\omega(a) = \mathrm{Tr}(\rho a)$ for some (unique)
density operator $\rho \in \mathcal{D}(H)$ iff it is normal.
Proof. First, eq. (2.33) implies (4.28). To see this, take the trace with respect to
some basis $(v_j)$ of $H$ that is adapted to the family $(e_i)$ in the sense that for each $j$,
either $e_iv_j = v_j$ (i.e., $v_j \in e_iH$) for one value of $i$, or $e_iv_j = 0$ for all $i$. Then

$$\omega\Big(\sum_i e_i\Big) = \mathrm{Tr}\Big(\rho\sum_i e_i\Big) = \sum_j \Big\langle v_j, \rho\sum_i e_iv_j\Big\rangle = {\sum_j}' \langle v_j, \rho v_j\rangle,$$

where the sum $\sum'_j$ is over those $j$ for which $v_j \in K = \bigvee_i e_iH$. On the other hand,
since the basis is adapted, we have $v_j \in K$ iff there is an $i$ for which $e_iv_j = v_j$ (since
otherwise $e_iv_j = 0$ and hence $v_j \perp e_iH$ for each $i$, so that $v_j \in K^\perp$), so

$$\sum_i \omega(e_i) = \sum_i \mathrm{Tr}(\rho e_i) = \sum_i\sum_j \langle v_j, \rho e_iv_j\rangle = \sup_{F\subset I} \sum_{j\in J_F} \langle v_j, \rho v_j\rangle = {\sum_j}' \langle v_j, \rho v_j\rangle,$$

where $F$ ranges over the finite subsets of $I$ and $J_F$ consists of those $j$ for which $v_j \in \sum_{i\in F} e_iH$. This gives (4.28).
Conversely, assume $\omega$ is normal. For the $e_i$ in (4.28) we now take the projections
(4.29) determined by some basis $(v_i)$. For each $a \in B(H)$ we then have

$$\omega(a) = \lim_F \omega(e_Fa). \qquad (4.32)$$

Indeed, using Cauchy–Schwarz for the positive semi-definite form $\langle a,b\rangle = \omega(a^*b)$,
as in (C.197), and using $\sum_i e_i = 1_H$ and hence $\omega(a) = \omega(\sum_i e_ia)$ we have

$$|\omega(a) - \omega(e_Fa)|^2 = |\omega(e_{F^c}a)|^2 \leq \omega(a^*a)\,\omega(e_{F^c}) \leq \|a\|^2\,\omega(e_{F^c}), \qquad (4.33)$$

since $e_{F^c} = \sum_{i\notin F} e_i$ is a projection. Since $\omega(e_F) + \omega(e_{F^c}) = \omega(1_H) = 1$, eq. (4.31)
gives $\lim_F \omega(e_{F^c}) = 0$, so that (4.33) gives (4.32). For each finite $F \subset I$, the operator
$e_Fa$ has finite rank and hence is compact. According to Theorem B.146, the
restriction of $\omega : B(H) \to \mathbb{C}$ to the C*-algebra $B_0(H)$ of compact operators on $H$ is
induced by a trace-class operator $\rho$, which (from the requirement that $\omega$ be a state)
must be a density operator. Hence $\omega(e_Fa) = \mathrm{Tr}(\rho e_Fa)$, and we finally have

$$\omega(a) = \lim_F \omega(e_Fa) = \lim_F \mathrm{Tr}(\rho e_Fa) = \mathrm{Tr}(\rho a). \qquad (4.34)$$

To derive the final equality, we rewrite $\mathrm{Tr}(\rho e_Fa) = \mathrm{Tr}(e_Fa\rho)$, cf. (A.78) and Proposition
B.144, note that $a\rho \in B_1(H)$, as shown in Corollary B.147, and observe
that for any $b \in B_1(H)$ we have $\lim_F \mathrm{Tr}(e_Fb) = \mathrm{Tr}(b)$. To see this, simply compute
the trace in the basis $(v_i)$ defining the projections $e_i$ through (4.29), so that
$\mathrm{Tr}(e_Fb) = \sum_{i\in F}\langle v_i, bv_i\rangle$, and note that by Definition B.6,

$$\lim_F \sum_{i\in F} \langle v_i, bv_i\rangle = \sum_i \langle v_i, bv_i\rangle = \mathrm{Tr}(b).$$

Finally, suppose $\omega(a) = \mathrm{Tr}(\rho_1a) = \mathrm{Tr}(\rho_2a)$ for each $a \in B(H)$ and hence for
each $a \in B_0(H)$. It follows from (B.476) that $\mathrm{Tr}(\rho a) = 0$ for all $a \in B_0(H)$ iff $\rho = 0$.
Hence $\rho_1 = \rho_2$, i.e., a normal state $\omega$ uniquely determines a density operator $\rho$.
If $\omega$ is normal, we may therefore use the spectral resolution (2.6) of the corresponding
density operator $\rho$, i.e., $\rho = \sum_i p_i|v_i\rangle\langle v_i|$, where $(v_i)$ is some basis of $H$
consisting of eigenvectors of $\rho$ (which exists because $\rho$ is compact and self-adjoint),
and the corresponding eigenvalues satisfy $p_i \geq 0$ and $\sum_i p_i = 1$; see the explanation
after Definition B.148. Computing the trace in the same basis gives

$$\mathrm{Tr}(\rho a) = \sum_i p_i\langle v_i, av_i\rangle. \qquad (4.35)$$


We may characterize normality in a number of other ways. First note that because
of the duality $B_1(H)^* \cong B(H)$ of Theorem B.146, cf. (B.477), we may equip $B(H)$
with the w*-topology in its role as the dual of the trace-class operators $B_1(H)$, see
§B.9; this means that $a_\lambda \to a$ iff $\mathrm{Tr}(\rho a_\lambda) \to \mathrm{Tr}(\rho a)$ for each $\rho \in B_1(H)$, or, equivalently,
for each $\rho \in \mathcal{D}(H)$, since each trace-class operator is a linear combination of
at most four density operators, as follows from Lemma C.53 with (C.8) - (C.9). The
w*-topology on $B(H)$, seen as the dual of $B_1(H)$, is called the σ-weak topology. By
Proposition B.46, the σ-weakly continuous linear functionals $\varphi$ on $B(H)$ are just
those given by $\varphi(a) = \mathrm{Tr}(ba)$ for some trace-class operator $b \in B_1(H)$.

Secondly, $B(H)$ is monotone complete, in the sense that each net $(a_\lambda)$ of positive
operators that is bounded (i.e., $0 \leq a_\lambda \leq c\cdot 1_H$ for some $c > 0$ and all $\lambda \in \Lambda$) and
increasing (in that $a_\lambda \leq a_{\lambda'}$ whenever $\lambda \leq \lambda'$) has a supremum $a$ with respect to the
standard ordering $\leq$ on $B(H)_+$, which supremum coincides with the strong limit of
the net (i.e., $\lim_\lambda a_\lambda\psi = a\psi$ for each $\psi \in H$); the proof is the same as for Proposition
B.98, and also here we write $a_\lambda \nearrow a$ to describe this entire situation.


Corollary 4.13. The following conditions on a state $\omega \in S(B(H))$ are equivalent:


1. $\omega$ is normal, cf. Definition 4.11;

2. $\omega(a) = \lim_\lambda \omega(a_\lambda)$ if $a_\lambda \nearrow a$;

3. $\omega(a) = \mathrm{Tr}(\rho a)$ for some density operator $\rho \in \mathcal{D}(H)$;

4. $\omega$ is σ-weakly continuous.


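Proof. We have seen 1 ↔ 3 ↔ 4, and 2 → 1 is obvious, so establishing 3 → 2 would
complete the proof. To this effect, we first note that because the sum (4.35) is convergent,
for $\varepsilon > 0$ we may find a finite subset $F \subset I$ for which $\sum_{i\notin F} p_i < \varepsilon/(2\|a\|)$
(assuming $a \neq 0$). Since $0 \leq a_\lambda \leq a$ also implies $a_\lambda \leq \|a\|\cdot 1_H$ (since $a \leq \|a\|\cdot 1_H$),
we therefore have $|\sum_{i\notin F} p_i\langle v_i, (a_\lambda - a)v_i\rangle| < 2\varepsilon/3$, uniformly in $\lambda$. Moreover, since
$F$ is finite and $a_\lambda \to a$ strongly, we can find $\lambda_0$ such that for all $\lambda > \lambda_0$ we have

$$\Big|\sum_{i\in F} p_i\langle v_i, (a_\lambda - a)v_i\rangle\Big| < \varepsilon/3. \qquad (4.36)$$

Consequently, for such $\lambda$,

$$|\mathrm{Tr}(\rho(a_\lambda - a))| \leq \Big|\sum_{i\in F} p_i\langle v_i, (a_\lambda - a)v_i\rangle\Big| + \Big|\sum_{i\notin F} p_i\langle v_i, (a_\lambda - a)v_i\rangle\Big| < \tfrac{1}{3}\varepsilon + \tfrac{2}{3}\varepsilon = \varepsilon.$$

This shows that $\lim_\lambda |\mathrm{Tr}(\rho(a_\lambda - a))| = 0$, so that assumption 3 implies no. 2.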
We denote the normal state space of $B(H)$, i.e., the set of all normal states on
$B(H)$, by $S_n(B(H))$. It is easy to see from Definition B.148 that $S_n(B(H))$ is a convex
(but not necessarily compact!) subset of the total state space $S(B(H))$.


Corollary 4.14. The relation $\omega(a) = \mathrm{Tr}(\rho a)$ induces an isomorphism

$$S_n(B(H)) \cong \mathcal{D}(H) \qquad (4.37)$$

of convex sets (i.e., $\omega \leftrightarrow \rho$). Furthermore, for the corresponding pure states we have

$$P_n(B(H)) \cong \mathcal{P}_1(H), \qquad (4.38)$$

i.e., any pure state $\omega$ on $B_0(H)$, as well as any normal pure state on $B(H)$, is given
by $\omega = \omega_\psi$ for some unit vector $\psi \in H$, where $\omega_\psi(a) = \langle\psi, a\psi\rangle$, cf. (2.42).


The proof of (4.38) is practically the same as in the finite-dimensional case. From
Theorem B.146 we obtain another characterization of $S_n(B(H))$ and hence of $\mathcal{D}(H)$:


Corollary 4.15. If $B_0(H)$ is the C*-algebra of compact operators on $H$, we have

$$S(B_0(H)) \cong S_n(B(H)); \qquad (4.39)$$
$$P(B_0(H)) \cong P_n(B(H)), \qquad (4.40)$$

in the sense that any (pure) state $\omega$ on $B_0(H)$ has a unique normal extension to a
(pure) state $\omega'$ on $B(H)$, given by the same density operator $\rho$ that yields $\omega$.


It can be shown that any state $\omega \in S(B(H))$ has a convex decomposition

$$\omega = t\omega_n + (1-t)\omega_s, \qquad (4.41)$$

where $t \in [0,1]$, $\omega_n$ is a normal state, and $\omega_s$ is called a singular state. In particular,
since for $t \in (0,1)$ the state $\omega$ is mixed, a pure state is either normal or singular.

Singular states are not as aberrant as the terminology may suggest: such states are
routinely used in the physics literature and are typically denoted by $|\lambda\rangle$, where $\lambda$ lies
in the continuous spectrum of some self-adjoint operator (that has to be maximal for
this notation to even begin to make sense, see §4.3 below). Examples of such "improper
eigenstates" are $|x\rangle$ and $|p\rangle$, which many physicists regard as idealizations.
However, mathematically such states are at least defined, namely as singular pure
states on $B(H)$. The key to the existence of such states lies in Proposition C.15 and
its proof, which should be reviewed now; we only need the case $a^* = a$.


Proposition 4.16. Let $a = a^* \in B(H)$ have non-empty continuous spectrum, so that
there is some $\lambda \in \sigma(a)$ that is not an eigenvalue of $a$. Then $\omega_\lambda(f(a)) = f(\lambda)$ defines
a pure state on $A = C^*(a)$, any pure extension of which to $B(H)$ is singular.


Proof. Normal pure states on $B(H)$ take the form $\omega_\psi(b) = \langle\psi, b\psi\rangle$, where $\psi \in H$ is
a unit vector and $b \in B(H)$. We know from Proposition C.14 that $\omega_\lambda$ is multiplicative
on $C^*(a)$. However, if some multiplicative state $\omega$ on $C^*(a)$ has the form $\omega = \omega_\psi$,
then $\psi$ must be an eigenvector of $a$; cf. the proof of Proposition 2.3.
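
The last step can be made explicit in one line. The following is a sketch of the standard argument (not a quotation of the proof of Proposition 2.3), writing $\lambda := \omega(a)$ for a multiplicative vector state $\omega = \omega_\psi$ and using $a^* = a$:

```latex
% \lambda := \omega(a) is real because a is self-adjoint; \omega = \omega_\psi is
% assumed multiplicative on C^*(a).
\begin{align*}
\|(a-\lambda)\psi\|^2
  = \langle\psi, (a-\lambda)^2\psi\rangle
  = \omega\big((a-\lambda)^2\big)
  = \big(\omega(a)-\lambda\big)^2 = 0 ,
\end{align*}
% hence a\psi = \lambda\psi, so that \omega(a) would be an eigenvalue of a.
```

Since the $\lambda$ in Proposition 4.16 is not an eigenvalue, no vector state can extend $\omega_\lambda$, which is why every pure extension must be singular.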
4.3 The Kadison-Singer Conjecture


To obtain deeper insight into singular pure states, and as a matter of independent
interest, we return to the Kadison–Singer problem, cf. §2.6. Recall that this problem
asks if some abelian unital C*-algebra $A \subset B(H)$ has the Kadison–Singer property,
stating that each pure state $\omega_A$ on $A$ has a unique pure extension $\omega$ to $B(H)$. Here the issue
is uniqueness rather than existence, since at least one such extension exists: since
$A$ is necessarily unital (with $1_A = 1_H$) and $\omega_A$ is a state on $A$, so that in particular
$\omega_A(1_A) = \|\omega_A\| = 1$, Corollary B.41 gives the existence of a bounded extension $\omega$
satisfying $\omega(1_H) = \|\omega\| = 1$, which by Proposition C.5 is a state on $B(H)$. Proposition
2.22 then gives the existence of a pure extension $\omega$. As in the finite-dimensional
case, the Kadison–Singer property forces $A$ to be maximal (in the poset $\mathcal{C}(B(H))$ of
all abelian unital C*-subalgebras of $B(H)$, ordered by inclusion):


Proposition 4.17. If some abelian unital C*-subalgebra $A$ of $B(H)$ has the Kadison–Singer
property, then $A$ is necessarily maximal.


Proof. We use the Gelfand isomorphism $A \cong C(P(A))$, where $P(A)$ is the pure state
space of $A$, cf. Theorem C.8 and Proposition C.14. If $A$ has the Kadison–Singer
property and $A \subseteq B \subset B(H)$, where $B$ is an abelian unital C*-subalgebra of $B(H)$,
then $\omega_A$ has a unique pure extension $\omega$ on $B(H)$, which restricts to some state $\omega_B$ on
$B$. The same reasoning as in the proof of Proposition 2.22 shows that $\omega_B$ is a pure
state on $B$, so that we obtain a unique map

$$P(A) \to P(B); \qquad (4.42)$$
$$\omega_A \mapsto \omega_B. \qquad (4.43)$$

The inverse of this map is simply the pullback of the inclusion $A \hookrightarrow B$, i.e., $\omega_B \in
P(B)$ defines $\omega_A \in P(A)$ by restriction, so that we have a bijection $P(A) \cong P(B)$,
$\omega_A \leftrightarrow \omega_B$. Since for any pair of C*-algebras $A \subset B$ the pullback $S(B) \to S(A)$ is
continuous (in the pertinent w*-topology), the map $\omega_B \mapsto \omega_A$ is continuous. As in
Lemma C.20, this implies that it is in fact a homeomorphism, so that $A \cong B$ through
the inclusion $A \hookrightarrow B$. This gives $A = B$, and hence $A$ is maximal.


Maximality of $A$ implies $A' = A$, so that $A$ is a von Neumann algebra, sharing the
unit of $B(H)$. To see the relevance of singular states for the Kadison–Singer problem,
we first settle the normal case. We know what it means for a state on $B(H)$
to be normal (cf. Definition 4.11 and Corollary 4.13); for arbitrary von Neumann
algebras $A \subset B(H)$ the situation is exactly the same: we define normality by (4.28)
and characterize it by the equivalent properties in Corollary 4.13, where the σ-weak
topology on $A$ may be defined either as the one inherited from $B(H)$, or, more
intrinsically, as the w*-topology from the duality $A \cong (A_*)^*$, where the Banach space
$A_*$ is the so-called predual of $A$, e.g., $(\ell^\infty)_* = \ell^1$ and $L^\infty(0,1)_* = L^1(0,1)$, cf. §B.9.


Theorem 4.18. Let $H$ be a separable Hilbert space and let $\omega_A$ be a normal pure
state on a maximal commutative unital C*-algebra $A$ in $B(H)$. Then $\omega_A$ has a unique
extension to a state $\omega$ on $B(H)$, which is necessarily pure and normal.

Proof. As noted after (4.41), a pure state on $B(H)$ is either normal or singular. The
possibility that $\omega_A$ is normal whereas $\omega$ is singular is excluded by Corollary 4.13.3,
so $\omega$ must be normal and hence given by a density operator. The proof of uniqueness
is then the same as in the finite-dimensional case, cf. Theorem 2.21.


We now recall the classification of maximal abelian *-algebras (and
hence of maximal abelian von Neumann algebras) $A$ in $B(H)$ up to unitary equivalence
(cf. Theorem B.118). This classification is the relevant one for the Kadison–Singer
problem, since, as is easily seen, $A \subset B(H)$ has the Kadison–Singer property
iff $uAu^{-1} \subset B(uH)$ has it. The uniqueness of the finite-dimensional case will be lost:


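Theorem 4.19. If $H$ is separable and infinite-dimensional, and $A \subset B(H)$ is a maximal
abelian *-algebra, then $A$ is unitarily equivalent to exactly one of the following:


1. $L^\infty(0,1) \subset B(L^2(0,1))$;
2. $\ell^\infty \subset B(\ell^2)$;
3. $L^\infty(0,1) \oplus \ell^\infty(\kappa) \subset B(L^2(0,1) \oplus \ell^2(\kappa))$,


where $\ell^\infty = \ell^\infty(\mathbb{N})$, $\ell^2 = \ell^2(\mathbb{N})$, and $\kappa$ is either $\{1,\ldots,n\}$, in which case $\ell^2(\kappa) = \mathbb{C}^n$
and $\ell^\infty(\kappa) = D_n(\mathbb{C})$, or $\kappa = \mathbb{N}$, in which case $\ell^2(\kappa) = \ell^2$ and $\ell^\infty(\kappa) = \ell^\infty$.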


This classification sheds some more light on Theorem 4.18. Since $L^\infty(0,1)$ has no
pure normal states and $D_n(\mathbb{C})$ has been dealt with in Theorem 2.21, the interesting
case is $\ell^\infty$. Using Corollary 4.13.3 (or the analysis below), it is easy to check that
the normal pure states on $\ell^\infty$ are given by $\omega_A(f) = f(x)$ for some $x \in \mathbb{N}$; these are
vector states of the kind $\omega_A(f) = \langle\psi, m_f\psi\rangle$ with $\psi = \delta_x$, or, in other words, they are
given by $\omega_A(f) = \mathrm{Tr}(\rho m_f)$ with $\rho = |\delta_x\rangle\langle\delta_x|$. We now invoke a fairly deep result:


Proposition 4.20. A pure state $\omega$ on $B(H)$ is singular iff one (and hence all) of the
following equivalent conditions is satisfied:


• $\omega(a) = 0$ for each $a \in B_0(H)$;
• $\omega(e) = 0$ for each one-dimensional projection $e$;
• $\sum_i \omega(e_i) = 0$ for the projections $e_i = |v_i\rangle\langle v_i|$ defined by some basis $(v_i)$.


One direction is easy: a normal pure state certainly does not satisfy the condition
in question. For example, given (2.42) one may take $a = |\psi\rangle\langle\psi|$, which as a
one-dimensional projection lies in $B_0(H)$, so that $\omega_\psi(a) = 1$. We omit the other direction
of the proof. We conclude from this proposition that a pure singular state on $B(\ell^2)$
cannot restrict to a normal pure state on $\ell^\infty$, which reconfirms Theorem 4.18.

We now study the Kadison—Singer property for each of the three cases in Theo- 
rem 4.19 (where the third will be an easy corollary of the first and the second). Since 
the proofs of the first two cases are formidable, we just sketch the argument. 


Theorem 4.21.
• There exist (necessarily singular) pure states on $L^\infty(0,1)$ that do
not have a unique extension to $B(L^2(0,1))$, and similarly for $L^\infty(0,1) \oplus \ell^\infty(\kappa)$.
• Any pure state on $\ell^\infty$ has a unique extension to $B(\ell^2)$.
The statement about $\ell^\infty$ is the Kadison–Singer Conjecture, which dates from 1959
but was only proved in 2013. The first claim (which was already known to Kadison
and Singer themselves) is equally remarkable, however, as is the contrast between
the two parts of Theorem 4.21. In particular, Dirac's notation $|\lambda\rangle$ may be ambiguous.
The key to the proof of the first claim lies in the choice of a total countable
family of normal states on $L^\infty(0,1)$, from which all pure states may be constructed
by a limiting operation. Here we call a (countable) family $(\omega_n)_{n\in\mathbb{N}}$ of states on some
C*-algebra $A$ total if, for any self-adjoint $a \in A$, the conditions $\omega_n(a) \geq 0$ for each $n$
imply $a \geq 0$ (the converse is trivial). For example, the well-known Haar basis $(h_n)$
of $L^2(0,1)$ provides such a family. The functions forming this basis are defined via
some bijection $\beta$ between the set of pairs $(k,l)$ and $\mathbb{N}$, e.g., $\beta(k,l) = l + 2^k$, by

$$h_n = \chi_{\beta^{-1}(n)}, \qquad (n \in \mathbb{N} = \{1,2,\ldots\}); \qquad (4.44)$$
$$\chi_{k,l}(x) = 2^{k/2} g(2^kx - l), \qquad (k \in \mathbb{N}\cup\{0\},\ 0 \leq l < 2^k); \qquad (4.45)$$
$$g(x) = 1_{[0,1/2)} - 1_{[1/2,1)}. \qquad (4.46)$$
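
The orthonormality of the functions in (4.45) is easy to confirm numerically. The sketch below assumes the reading of (4.45) and (4.46) with half-open dyadic intervals; the helper `inner` and the restriction to scales k < 4 are choices made for the example.

```python
def g(x):
    """Mother function (4.46): +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def chi(k, l, x):
    """Haar function (4.45) with scale k >= 0 and position 0 <= l < 2**k."""
    return 2.0 ** (k / 2.0) * g(2 ** k * x - l)

def inner(u, v, m=8):
    """L^2(0,1) inner product via midpoints of the 2**m dyadic cells.

    This is exact (up to rounding) for functions that are constant on each
    cell, which holds for chi(k, l, .) whenever k + 1 <= m.
    """
    n = 2 ** m
    return sum(u((i + 0.5) / n) * v((i + 0.5) / n) for i in range(n)) / n

labels = [(k, l) for k in range(4) for l in range(2 ** k)]
gram = {(p, q): inner(lambda x: chi(*p, x), lambda x: chi(*q, x))
        for p in labels for q in labels}
```

The resulting Gram matrix is the identity up to rounding. Note also that $h_n^2$ is $2^k$ times the indicator function of a dyadic interval of length $2^{-k}$, so the Haar state $\omega_n$ simply averages $f$ over that interval.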


Basic analysis then shows that the Haar functions $h_n$ form a basis of $L^2(0,1)$ and
that the associated vector states $\omega_n$ on $L^\infty(0,1)$ form a total set, where obviously

$$\omega_n(f) = \langle h_n, m_fh_n\rangle = \int_0^1 dx\, h_n(x)^2 f(x). \qquad (4.47)$$

The relevance of total sets to the conjecture is explained by the following lemma.


Lemma 4.22. If $T \subset S(A)$ is a total set of states on a unital C*-algebra $A$, then

$$S(A) = \mathrm{co}(T)^-; \qquad (4.48)$$
$$P(A) \subset T^-, \qquad (4.49)$$

where $\mathrm{co}(T)^-$ is the w*-closure of the convex hull of $T$ in $A^*$ or in $S(A)$.


Proof. The inclusion $\mathrm{co}(T)^- \subseteq S(A)$ is obvious, since $T \subset S(A)$ and $S(A)$ is a compact
(and hence a closed) convex set. To prove the converse inclusion, suppose
$a = a^* \in A$ and $s \in \mathbb{R}$ are such that $\omega(a) \geq s$ for each $\omega \in T$. Then $\omega(a - s\cdot 1_A) \geq 0$
for each $\omega \in T$, so that $a - s\cdot 1_A \geq 0$ because $T$ is total,
and hence $\omega(a) \geq s$ for each $\omega \in S(A)$. Using Theorem B.43 (of Hahn–Banach
type), this property would lead to a contradiction if $S(A)$ were not contained in
$\mathrm{co}(T)^-$.

The second claim, which is the one we will use, follows from the first through a
corollary of the Krein–Milman Theorem B.50, stating that if $T \subset K$ is any subset of
a compact convex set $K$ such that $K = \mathrm{co}(T)^-$, then $\partial_eK \subset T^-$. This corollary may
be proved (by contradiction) from Theorem B.43 in a similar way.


Our next aim is to get rid of the closure in (4.49). The Haar basis yields a map

$$h : \mathbb{N} \to S(L^\infty(0,1)); \qquad (4.50)$$
$$n \mapsto \omega_n, \qquad (4.51)$$

with image $T$, i.e., the set of Haar states. Since $S(A)$ is a compact Hausdorff space (in
its w*-topology), the universal property (B.135) of the Čech–Stone compactification
$\beta\mathbb{N}$ of $\mathbb{N}$ implies that $h$ extends (uniquely) to a continuous map

$$\beta h : \beta\mathbb{N} \to S(A),$$

whose image is compact and hence closed (since $\beta\mathbb{N}$ is compact). Since $T = h(\mathbb{N}) \subset
S(A)$ we have $T \subset \beta h(\beta\mathbb{N})$ and hence $T^- \subset \beta h(\beta\mathbb{N})$, so that, from (4.49),

$$P(L^\infty(0,1)) \subset \beta h(\beta\mathbb{N}). \qquad (4.52)$$

Hence each pure state $\omega_c \equiv \omega_{L^\infty(0,1)}$ on $L^\infty(0,1)$ takes the form $\omega_c = \omega_c^{(U)}$, where

$$\omega_c^{(U)}(f) = \lim_U \omega_n(f) = \bigcap_{A\in U} \{\omega_n(f) \mid n \in A\}^-, \qquad f \in L^\infty(0,1), \qquad (4.53)$$

and $U \in \beta\mathbb{N}$ is some ultrafilter on $\mathbb{N}$, cf. (B.136). The point of this analysis, then, is
that $\omega_c^{(U)}$ can immediately be extended to $B(L^2(0,1))$ by the same formula, i.e.,

$$\omega^{(U)}(a) = \lim_U \omega_n(a) = \bigcap_{A\in U} \{\omega_n(a) \mid n \in A\}^-, \qquad a \in B(L^2(0,1)), \qquad (4.54)$$

where $\omega_n(a) = \langle h_n, ah_n\rangle$. If $L^\infty(0,1)$ had the Kadison–Singer property, this would be the
unique extension of $\omega_c^{(U)}$, and we will show that this leads to a contradiction.

Apart from the use of ultrafilters, the technically most challenging part of the
argument disproving the Kadison–Singer property for $L^\infty(0,1)$ is as follows. If $A =
C([0,1])$, for any pure state $\omega \in P(A)$ there is some $x \in [0,1]$ such
that $\omega(f) = f(x)$ for each $f \in A$; see Propositions C.14 and C.19. For $A = L^\infty(0,1)$ the situation is
not that simple due to measure zero complications. Nonetheless, it is easy to show
that for each positive $f \in L^\infty(0,1)$ and $\omega_c \in P(L^\infty(0,1))$ and each $\varepsilon > 0$ one has

$$\mu(\{x \in (0,1) \mid f(x) \in [\omega_c(f) - \varepsilon, \omega_c(f) + \varepsilon]\}) > 0, \qquad (4.55)$$

where $\mu$ is Lebesgue measure on $(0,1)$. Taking the projection

$$e = 1_{\{x\in(0,1) \,\mid\, f(x) \in [\omega_c(f) - \varepsilon/2,\ \omega_c(f) + \varepsilon/2]\}},$$

it follows that for each positive $f \in L^\infty(0,1)$, $\omega_c \in P(L^\infty(0,1))$ and $\varepsilon > 0$ there exists
a projection $e \in \mathcal{P}(L^\infty(0,1))$ with $\omega_c(e) = 1$ and $\|ef - e\,\omega_c(f)\| < \varepsilon$. Hard analysis
then generalizes this property from $L^\infty(0,1)$ to $B(L^2(0,1))$, as follows:


Lemma 4.23. If $\omega_c \in P(L^\infty(0,1))$ has a unique extension $\omega$ to $B(L^2(0,1))$ (which is
necessarily pure if it is unique), then for each $a \in B(L^2(0,1))$ and $\varepsilon > 0$ there exists
a projection $e \in \mathcal{P}(L^\infty(0,1))$ with $\omega_c(e) = 1$ and

$$\|ea - e\,\omega(a)\| < \varepsilon. \qquad (4.56)$$
To derive a contradiction between (4.54) and (4.56), we use a bijection $b : \mathbb{N} \to \mathbb{N}$
that cyclically permutes the ordered subsets $(2^k+1,\ldots,2^{k+1})$, $k = 0,1,\ldots$, that is,
(1,2), (3,4), (5,6,7,8), (9,...,16), etc. This bijection induces a unitary operator

$$u : L^2(0,1) \to L^2(0,1); \qquad (4.57)$$
$$uh_n = h_{b(n)}, \qquad (4.58)$$

which is easily shown to have the following properties:

$$\omega_n(u) = 0, \qquad n \in \mathbb{N}; \qquad (4.59)$$
$$\|eue\| = 1, \qquad e \in \mathcal{P}(L^\infty(0,1)),\ e \neq 0. \qquad (4.60)$$

To show that $L^\infty(0,1)$ fails to have the Kadison–Singer property, suppose it does, so
that any $\omega_c \in P(L^\infty(0,1))$ has a unique extension $\omega \in P(B(L^2(0,1)))$. As already
noted, we may then assume that $\omega_c = \omega_c^{(U)}$, as in (4.53), whilst $\omega = \omega^{(U)}$, as in
(4.54). Taking $a = u$ then gives $\omega(u) = 0$, see (4.59), so that $\|eu\| < \varepsilon$ by (4.56), and hence
also $\|eue\| \leq \|eu\| < \varepsilon$. But
this contradicts (4.60), finishing the sketch of the proof of the first claim in Theorem
4.21. The remark about $L^\infty(0,1) \oplus \ell^\infty(\kappa)$ follows from the one about $L^\infty(0,1)$.

We now pass to the (even) more difficult case of $\ell^\infty \subset B(\ell^2)$. Although this will
not be used in the proof, it gives some insight to know which states on $\ell^\infty$ we are
actually talking about, i.e., the singular pure states, and compare this with (4.53).


Theorem 4.24. There is a bijective correspondence

$$\omega_\mu(f) = \int_{\mathbb{N}} d\mu\, f \qquad (4.61)$$

between states $\omega_\mu$ on $\ell^\infty$ and finitely additive probability measures $\mu$ on $\mathbb{N}$, where:


1. $\omega_\mu$ is normal iff $\mu$ is countably additive (and hence is a probability measure).
2. $\omega_\mu$ is pure iff $\mu$ corresponds to some ultrafilter $U$ on $\mathbb{N}$, in which case:
$\omega_\mu$ is normal iff $U$ is principal (and hence singular iff $U$ is free).


This follows from case no. 5 in §B.9, notably eqs. (B.153) - (B.154). In other words,
the pure states $\omega_d^{(U)}$ on $\ell^\infty$ are given by ultrafilters $U$ on $\mathbb{N}$ through

$$\omega_d^{(U)}(f) = \beta f(U) = \lim_U f(n); \qquad (4.62)$$

the analogy with (4.53) is even clearer if we write $f(n) = \langle\delta_n, m_f\delta_n\rangle \equiv \omega_n(f)$. If
$U = U_n$ is a principal ultrafilter, $n \in \mathbb{N}$, we thus recover the normal pure states

$$\omega_d^{(U_n)}(f) = f(n). \qquad (4.63)$$

As in (4.54), we find at least one natural extension $\omega^{(U)}$ of $\omega_d^{(U)}$ to $B(\ell^2)$, namely

$$\omega^{(U)}(a) = \lim_U \omega_n(a). \qquad (4.64)$$
We now show that $\ell^\infty$ has the Kadison–Singer property, making $\omega^{(U)}$ the
only extension of $\omega_d^{(U)}$. The proof relies on an extremely difficult lemma from linear
algebra (formerly known as a paving conjecture). We first define a linear map $D :
M_n(\mathbb{C}) \to D_n(\mathbb{C})$ by $D(a)_{ii} = a_{ii}$, $i = 1,\ldots,n$, and $D(a)_{ij} = 0$ whenever $i \neq j$.


Lemma 4.25. For any $\varepsilon > 0$ there exists $l \in \mathbb{N}$ such that for all $n \in \mathbb{N}$ and $a \in M_n(\mathbb{C})$
with $D(a) = 0$, there are $l$ projections $(e_1,\ldots,e_l)$ in $D_n(\mathbb{C})$ such that

$$\sum_{k=1}^{l} e_k = 1_n; \qquad (4.65)$$
$$\|e_iae_i\| \leq \varepsilon\|a\|, \qquad i = 1,\ldots,l. \qquad (4.66)$$
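
To see what (4.65) and (4.66) ask for, it may help to look at a toy case in which a paving can be written down by hand. In the sketch below the matrix (a cyclic shift on $\mathbb{C}^4$), the two diagonal projections, and all function names are assumptions chosen for illustration; the lemma itself is far deeper, since it asserts that the number $l$ of projections depends on $\varepsilon$ only.

```python
def diagonal_part(a):
    """The map D: keep the diagonal entries of a square matrix, zero the rest."""
    n = len(a)
    return [[a[i][j] if i == j else 0 for j in range(n)] for i in range(n)]

def compress(a, block):
    """Entries of e a e for the diagonal projection e onto the indices in `block`."""
    n = len(a)
    return [[a[i][j] if (i in block and j in block) else 0 for j in range(n)]
            for i in range(n)]

def max_abs_entry(a):
    """Largest absolute value of an entry; zero iff the matrix vanishes."""
    return max(abs(x) for row in a for x in row)

# A matrix with D(a) = 0: the cyclic shift on C^4.
a = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]

# Two diagonal projections summing to the identity: even and odd indices.
blocks = [{0, 2}, {1, 3}]
compressions = [compress(a, b) for b in blocks]
```

Both compressions vanish identically, so this particular matrix is paved by $l = 2$ diagonal projections for every $\varepsilon > 0$. The function `diagonal_part` is also idempotent, a property of $D$ that is used further on.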


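Since this estimate is uniform in $n$, the lemma extends to $\ell^2$, where $D : B(\ell^2) \to \ell^\infty$
is defined analogously, i.e., $D(a)$ is diagonal in the canonical basis $(\delta_n)$ of $\ell^2$ with

$$D(a)\delta_n = \omega_n(a)\delta_n, \qquad n \in \mathbb{N}. \qquad (4.67)$$

Lemma 4.26. For any $\varepsilon > 0$ there exists $l \in \mathbb{N}$ such that for all $a \in B(\ell^2)$ with $D(a) =
0$, there are $l$ projections $(e_1,\ldots,e_l)$ in $\ell^\infty$ such that

$$\sum_{k=1}^{l} e_k = 1_{\ell^2}; \qquad (4.68)$$
$$\|e_iae_i\| \leq \varepsilon\|a\|, \qquad i = 1,\ldots,l. \qquad (4.69)$$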


Now suppose that $\omega_d \in P(\ell^\infty)$, that $\omega \in S(B(\ell^2))$ extends $\omega_d$, and that $a \in B(\ell^2)$ has
$D(a) = 0$. Let $e_i$ be one of the projections in Lemma 4.26. Using Cauchy–Schwarz
for the sesquilinear form $\langle a,b\rangle = \omega(a^*b)$, we obtain (using $e_i^2 = e_i^* = e_i$)

$$|\omega(e_iae_j)|^2 \leq \omega(e_i)\,\omega(e_ja^*ae_j); \qquad (4.70)$$
$$|\omega(e_iae_j)|^2 \leq \omega(e_iaa^*e_i)\,\omega(e_j). \qquad (4.71)$$

Since $\omega(e_i) = \omega_d(e_i)$ and $\omega_d$ is a pure state (and hence is multiplicative), we have
$\omega(e_i) \in \{0,1\}$, since $e_i$ is a projection. Moreover, in view of (4.68) and the
normalization $\omega(1_{\ell^2}) = 1$, there must be exactly one value of $i = 1,\ldots,l$, say $i = i_0$,
such that $\omega(e_{i_0}) = 1$, and $\omega(e_i) = 0$ for all $i \neq i_0$. Eqs. (4.70) - (4.71) therefore
imply that $\omega(e_iae_j) = 0$ unless $i = j = i_0$. Using (4.68) once more, we see that
$\omega(a) = \sum_{i,j}\omega(e_iae_j) = \omega(e_{i_0}ae_{i_0})$, so that $|\omega(a)| \leq \|\omega\|\,\|e_{i_0}ae_{i_0}\| \leq 1\cdot\varepsilon\|a\|$ by
(4.66). Letting $\varepsilon \to 0$, we proved:

Lemma 4.27. If $\omega \in S(B(\ell^2))$ extends $\omega_d \in P(\ell^\infty)$, and $D(a) = 0$, then $\omega(a) = 0$.


Since $D^2 = D$, we have $D(a - D(a)) = 0$, so that for any $a \in B(\ell^2)$, we have

$$\omega(a) = \omega(D(a)) = \omega_d(D(a)), \qquad (4.72)$$

provided that $\omega$ extends $\omega_d$, as before. This shows that $\omega$ is determined by $\omega_d$ and
hence is unique, completing the proof (sketch) of Theorem 4.21.
4.4 Gleason’s Theorem in arbitrary dimension


To a large extent the thrust and difficulty of the proof of Gleason’s Theorem 2.28 
already lies in its finite-dimensional version, but some care is needed in the gen- 
eral case, and also Corollary 2.29 needs to be refined. A major point here is that 
Definition 2.23 has no unambiguous generalization to arbitrary Hilbert spaces. 


Definition 4.28. Let $H$ be an arbitrary Hilbert space with unit sphere $H_1$.
1. A probability distribution on $\mathcal{P}(H)$ is a map $p : H_1 \to [0, 1]$ that satisfies

    $\sum_{i \in I} p(v_i) = 1$, for any basis $(v_i)_{i \in I}$ of $H$, (4.73)

where, as in §B.12, the sum (over a possibly uncountable index set) is meant as
in Definition B.6. In particular, if $H$ is separable and the basis is labeled and
ordered by $I = \mathbb{N}$, then it is an ordinary convergent sum of the kind $\sum_{i=1}^{\infty} \cdots$.

2. A map $P : \mathcal{P}(H) \to [0, 1]$ that satisfies $P(1_H) = 1$ is called a:

a. finitely additive probability measure if

    $P\left(\sum_{j \in J} e_j\right) = \sum_{j \in J} P(e_j)$ (4.74)

for any finite collection $(e_j)_{j \in J}$ of mutually orthogonal projections on $H$ (i.e.,
$e_j H \perp e_k H$, or equivalently, $e_j e_k = 0$, whenever $j \neq k$); this is equivalent to
the condition $P(e + f) = P(e) + P(f)$ whenever $ef = 0$, cf. Definition 2.23.2.

b. probability measure if (4.74) holds for any countable collection $(e_j)_{j \in J}$ of
mutually orthogonal projections on $H$, where the first sum is defined in the
strong operator topology; note that the strong sum $\sum_j e_j$ coincides with the
supremum $\bigvee_j e_j$ of the given family, defined with respect to the usual ordering
of projections (that is, $e \leq f$ iff $eH \subseteq fH$).

c. completely additive probability measure if (4.74) holds for arbitrary
collections $(e_j)_{j \in J}$ of mutually orthogonal projections on $H$ (the first sum again
meant in the strong operator topology, with the same comment as above).


Thus a probability measure is by definition $\sigma$-additive in the usual sense of measure
theory; the other two cases are unusual from that perspective. However, if $H$ is
separable, then $J$ can be at most countable, so that complete additivity is the same
as $\sigma$-additivity and hence any probability measure is completely additive. Surprisingly,
assuming the Continuum Hypothesis (CH) of set theory, it can be shown that
this is even the case for arbitrary Hilbert spaces. The fundamental distinction, then,
is between finitely additive probability measures and probability measures (which
by definition are countably additive). As we shall see, this reflects the distinction
between arbitrary and normal states on $B(H)$, respectively, cf. §4.2. In what follows,
in dealing with non-separable Hilbert spaces we assume CH, in which case
probability distributions on $H$ are equivalent to probability measures on $\mathcal{P}(H)$.




The proof is the same as in finite dimension (taking into account that infinite sums 
over projections are defined strongly). Even without CH, Gleason’s Theorem still 
holds for non-separable Hilbert spaces if we assume P to be completely additive, and 
probability distributions are equivalent to completely additive probability measures
on $\mathcal{P}(H)$. For separable Hilbert spaces, CH is irrelevant and unnecessary altogether.
We then have the following generalization (and bifurcation) of Theorem 2.28. 


Theorem 4.29. Let H be a Hilbert space of dimension > 2. 


1. Each probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique normal state on
$B(H)$ via (2.122), i.e.,

    $P(e) = \mathrm{Tr}(\rho e)$, (4.75)

where $\rho$ is a density operator on $H$ uniquely determined by $P$.
Equivalently, each probability distribution $p$ on $\mathcal{P}(H)$ is given by (2.123), or

    $p(v) = \langle v, \rho v \rangle$. (4.76)

Conversely, each density operator $\rho$ on $H$ defines a probability measure $P$ on
$\mathcal{P}(H)$ via (4.75), as well as a probability distribution $p$ on $\mathcal{P}(H)$ via (4.76).

2. Each finitely additive probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique
state $\omega$ on $B(H)$ via

    $P(e) = \omega(e)$, (4.77)

and similarly each probability distribution $p$ on $\mathcal{P}(H)$ is given by

    $p(v) = \omega(e_v)$. (4.78)

Conversely, each state $\omega$ on $B(H)$ defines a finitely additive probability measure $P$
on $\mathcal{P}(H)$ via (4.77), as well as a probability distribution $p$ on $\mathcal{P}(H)$ via (4.78).


Proof. The proof of part 1 is practically the same as in finite dimension, except for
the fact that in the proof of Lemma 2.33 the reference to Proposition A.23 should be
replaced by Proposition B.79, upon which one obtains a bounded positive operator $\rho$
for which (2.123) holds. The normalization condition (2.110) then yields $\mathrm{Tr}(\rho) = 1$
if the trace is taken over any basis of $H$, and since $\rho$ is positive this implies
$\rho \in B_1(H)$, see §B.20 (complete additivity of $P$ is needed only to relate it to $p$).
Unfortunately, the proof of part 2 exceeds the scope of this book (see Notes).
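
As a numerical illustration of (4.76), the following minimal Python sketch checks that $p(v) = \langle v, \rho v \rangle$ sums to 1 over two different orthonormal bases of $\mathbb{C}^3$, as (4.73) requires. The density matrix and the two bases are hypothetical choices made for this example only; they do not come from the text.

```python
import math

def inner(v, w):
    """Inner product <v, w>, conjugate-linear in the first entry."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def apply(m, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def born(rho, v):
    """p(v) = <v, rho v> as in (4.76); real for a density matrix rho."""
    return inner(v, apply(rho, v)).real

# Hypothetical density matrix on C^3: weight 0.7 on the projection onto psi,
# weight 0.3 on the projection onto the third standard basis vector.
s = 1 / math.sqrt(2)
psi = [s, 1j * s, 0]
rho = [[0.7 * psi[i] * psi[j].conjugate() for j in range(3)] for i in range(3)]
rho[2][2] += 0.3

# Two different orthonormal bases of C^3 (assumed for illustration).
standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rotated = [[s, s, 0], [s, -s, 0], [0, 0, 1]]

total_standard = sum(born(rho, v) for v in standard)
total_rotated = sum(born(rho, v) for v in rotated)
```

Both totals equal $\mathrm{Tr}(\rho) = 1$ up to rounding, independently of the basis chosen.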


In infinite dimension, Corollary 2.29 becomes more complicated, too; for one 
thing, Definition 2.26 of a quasi-state bifurcates into two possibilities. The one given 
still makes perfect sense and is natural from the point of view of Bohrification; to 
avoid confusion we call a map $\omega : B(H) \to \mathbb{C}$ satisfying the conditions in Definition 2.26
a strong quasi-state. In the context of Gleason’s Theorem, a slightly
different notion is appropriate: a weak quasi-state on B(H) satisfies Definition 2.26, 
except that linearity is only required on commutative C*-algebras in B(H) of the 
form C*(a), where a = a* € B(H) (these are singly generated). Since commutative 
unital C*-subalgebras of B(H) are not necessarily singly generated, and a specific 
counterexample exists, weak quasi-states are not necessarily strong quasi-states.

Proposition 4.30. The map $\omega \mapsto \omega|_{\mathcal{P}(H)}$ gives a bijective correspondence between
weak quasi-states $\omega$ on $B(H)$ and finitely additive probability measures on $\mathcal{P}(H)$.


Proof. For some finite family $(e_1, \ldots, e_n)$ of mutually orthogonal projections on $H$,
add $e_0 = 1_H - \sum_j e_j$ if necessary and let $a = \sum_{j=0}^{n} \lambda_j e_j$, with all $\lambda_j \in \mathbb{R}$ different.
Then $\sigma(a) = \{\lambda_0, \ldots, \lambda_n\}$, so that $C^*(a) \cong C(\sigma(a)) \cong \mathbb{C}^{n+1}$ (cf. Theorem B.94)
coincides with the linear span of the projections $e_j$. If $\omega$ is a weak quasi-state, then it
is linear on $C^*(a)$ and hence also on the $e_j$, so that $\omega|_{\mathcal{P}(H)}$ is finitely additive.

Conversely, let $\mu$ be a finitely additive probability measure on $\mathcal{P}(H)$. If $a = a^* \in
B(H)$ is given, using the notation (B.328) we symbolically define $\omega$ on $a$ by

    $\omega(a) = \int_{\sigma(a)} \lambda \, d\mu(e_\lambda)$. (4.79)

More precisely, for any $\varepsilon > 0$ we use Corollary B.104 to define $\omega_\varepsilon(a) = \sum_{i=1}^{n} \lambda_i \mu(e_{A_i})$
and let $\omega(a) = \lim_{\varepsilon \to 0} \omega_\varepsilon(a)$; it follows from Lemma B.103 (or the theory underlying
the Riemann-Stieltjes integral (4.79)) that this limit exists. Now let $b, c \in C^*(a)$,
so that $b = f(a)$ and $c = g(a)$ for certain $f, g \in C(\sigma(a))$, and $b + c = (f + g)(a)$, cf.
Theorem B.94. By (B.325) we therefore have $\omega_\varepsilon(b + c) = \sum_{i=1}^{n} (f + g)(\lambda_i) \mu(e_{A_i})$,
which, since $(f + g)(\lambda_i) = f(\lambda_i) + g(\lambda_i)$, again by (B.325) equals $\omega_\varepsilon(b) + \omega_\varepsilon(c)$.
Since this holds for every $\varepsilon > 0$, letting $\varepsilon \to 0$ we obtain $\omega(b + c) = \omega(b) + \omega(c)$,
making $\omega$ linear on $C^*(a)$. It is clear that the quasi-state $\omega$ thus obtained, on
restriction to $\mathcal{P}(H)$, reproduces $\mu$, making the map $\omega \mapsto \omega|_{\mathcal{P}(H)}$ surjective. Finally,
injectivity of this map follows from Corollary B.104.


Corollary 4.31. If $\dim(H) > 2$, then each weak quasi-state on $B(H)$ (and a fortiori
each strong quasi-state) is linear and hence is actually a state.


This is immediate from Theorem 4.29.2. and Proposition 4.30. 
Another corollary of Gleason’s Theorem is the Kochen—Specker Theorem, which 
we will explain in detail in Chapter 6, where it will also be proved in a different way. 


Theorem 4.32. If $\dim(H) > 2$, there are no weak quasi-states $\omega : B(H) \to \mathbb{C}$ whose
restriction to each C*-subalgebra $C^*(a) \subset B(H)$ is pure (where $a = a^* \in B(H)$).


Equivalently, there are no nonzero maps $\omega' : B(H)_{sa} \to \mathbb{R}$ that are:


• Dispersion-free, i.e., $\omega'(a^2) = \omega'(a)^2$ for each $a \in B(H)_{sa}$;
• Quasi-linear, i.e., linear on commuting operators.


Cf. Definitions 6.1 and 6.3. To see that these conditions are equivalent to those stated
in Theorem 4.32 (despite the impression that linearity on all commuting self-adjoint
operators seems stronger than linearity on each $C^*(a)$), extend $\omega'$ to $\omega : B(H) \to \mathbb{C}$
by complex linearity, as in Definition 2.26.1, and note that dispersion-freeness
implies positivity and hence continuity on each subalgebra $C^*(a)$ (cf. Theorem C.52
and Lemma C.4). We then see that the two conditions just stated imply that $\omega$ is
multiplicative on $C^*(a)$, and hence pure, see Proposition C.14, which conversely
implies that pure states on $C^*(a)$ are dispersion-free. We now prove Theorem 4.32.

Proof. If $e$ is a projection, then $e^2 = e$, so that $\omega(e^2) = \omega(e)$. Since $\omega$ is
dispersion-free (as just explained), we also have $\omega(e^2) = \omega(e)^2$, whence $\omega(e)^2 = \omega(e)$ and
hence $\omega(e) \in \{0, 1\}$. Furthermore, since $\omega$ is a state by Corollary 4.31, we may apply
the GNS-construction, see Theorem C.88 (whose notation we use). In particular,
for any projection $e$, using the fact that $\pi_\omega(e) = \pi_\omega(e)^* \pi_\omega(e)$, by (C.196) we have

    $\omega(e) = \langle \Omega_\omega, \pi_\omega(e) \Omega_\omega \rangle = \|\pi_\omega(e) \Omega_\omega\|^2$. (4.80)

If $\omega(e) = 0$, then $\pi_\omega(e) \Omega_\omega = 0$ from the second equality. If $\omega(e) = 1$, then
$\pi_\omega(e) \Omega_\omega = \Omega_\omega$ from the first equality and Cauchy–Schwarz (in which we have
equality, so that $\pi_\omega(e) \Omega_\omega = z \Omega_\omega$ for some $z \in \mathbb{T}$, upon which (4.80) forces $z = 1$).

By the spectral theorem (e.g. in the form Corollary B.104) or the theory of von
Neumann algebras, the linear span of $\mathcal{P}(H)$ is norm-dense in $B(H)$. Since $\Omega_\omega$ is
cyclic for $\pi_\omega(B(H))$ by the GNS-construction, it must be that $H_\omega = \mathbb{C} \cdot \Omega_\omega$, and
hence $\pi_\omega(a) = \omega(a) \cdot 1_{H_\omega}$, for any $a \in B(H)$. Since $\pi_\omega(ab) = \pi_\omega(a) \pi_\omega(b)$ by the
GNS-construction, this gives $\omega(ab) = \omega(a) \omega(b)$ for all $a, b \in B(H)$. However, such
multiplicative states $\omega$ on $B(H)$ cannot exist if $\dim(H) > 1$. This is clear if $\omega$ is
normal, cf. Proposition 2.10, so that the following argument (which also covers the
normal case) is especially meant for the case where $\omega$ is singular.


1. If $\dim(H) = n < \infty$, there are $n$ one-dimensional projections $(e_1, \ldots, e_n)$ such
that $\sum_j e_j = 1_H$ (indeed, we may assume that $B(H) = M_n(\mathbb{C})$ and take diagonal
matrices $e_1 = \mathrm{diag}(1, 0, \ldots, 0)$, etc.). Now for any pair $(e_i, e_j)$ there is some
$v \in B(H)$ (which by definition is a partial isometry) such that $e_i = vv^*$, $e_j = v^*v$ (in
the above case $e_i$ and $e_j$ are thus related if $v_{ij} = 1$ and $v_{kl} = 0$ otherwise). Hence

    $\omega(e_i) = \omega(vv^*) = \omega(v)\omega(v^*) = \omega(v^*v) = \omega(e_j)$, (4.81)

since $\omega$ is multiplicative. But $\omega$ is also additive, which implies

    $\sum_{j=1}^{n} \omega(e_j) = \omega\left(\sum_{j=1}^{n} e_j\right) = \omega(1_H) = 1$. (4.82)

Since also $\omega(e_j) \in \{0, 1\}$, eqs. (4.81) - (4.82) are clearly contradictory.
2. If $\dim(H) = \infty$, separable or not, a similar contradiction arises from the halving
lemma, which states that there is a projection $e$ and an operator $v$ such that $e =
vv^*$, $1_H - e = v^*v$. For example, in the separable case assume $H = \ell^2$ and take $e$
the projection onto the closed linear span $\ell^2_e$ of the basis vectors $(\delta_x)$ with $x \in \mathbb{N}$
even, so that $1_H - e$ projects onto the closed linear span $\ell^2_o$ of the basis vectors
$(\delta_x)$ with $x \in \mathbb{N}$ odd. Then $\ell^2 = \ell^2_e \oplus \ell^2_o$; take $v = 0$ on $\ell^2_e$ and $v : \ell^2_o \to \ell^2_e$ any
unitary operator. In general, a similar method works, for if $I$ is a set indexing
some basis of $H$ one may find a subset $E \subset I$ that has the same cardinality as its
complement $I \setminus E$, upon which $\ell^2(E) \cong \ell^2(I \setminus E)$, cf. Theorem B.63.
Multiplicativity of $\omega$ then leads to a similar contradiction between the properties
$\omega(e) = \omega(1_H - e)$, as in (4.81), and $\omega(e) + \omega(1_H - e) = \omega(1_H) = 1$, as in (4.82):
if $\omega(e) = 0$ one finds $0 = 1$, whereas $\omega(e) = 1$ implies $2 = 1$.
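
The finite-dimensional step of this argument can be checked mechanically. The sketch below (plain Python, with matrix size and indices chosen arbitrarily for illustration) builds the matrix unit $v$ with a single entry 1, verifies $e_i = vv^*$ and $e_j = v^*v$, and then lists the values in $\{0, 1\}$ that are compatible with both (4.81) and (4.82). The helper names are assumptions of this example.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def matrix_unit(n, i, j):
    """n x n matrix with entry 1 at position (i, j) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

# Partial isometry v relating the diagonal projections e_i and e_j (0-based indices).
n, i, j = 3, 0, 2
v = matrix_unit(n, i, j)
e_i = matrix_unit(n, i, i)
e_j = matrix_unit(n, j, j)
v_vstar = matmul(v, adjoint(v))   # should equal e_i
vstar_v = matmul(adjoint(v), v)   # should equal e_j

def consistent_values(n):
    """Common values w = omega(e_1) = ... = omega(e_n) in {0, 1}, cf. (4.81),
    that also satisfy n * w = 1, cf. (4.82)."""
    return [w for w in (0, 1) if n * w == 1]
```

For $n = 1$ the value 1 survives, whereas for every $n > 1$ the list is empty, which is the contradiction used in the proof.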
Notes


§4.1. The Born rule from Bohrification (II)

The Born measure (and its construction along the lines of this section) is well
known in functional analysis, cf. Pedersen (1989), §4.5. For the Hamburger Moment
Problem see, for example, Reed, M. & Simon, B. (1975), Methods of Modern
Mathematical Physics. Vol II. Fourier Analysis, Self-adjointness (New York: Academic
Press), Theorem X.4, p. 145 and Example 4, p. 205. In fact, the proof uses
spectral theory! Corollary 4.6 was suggested by the treatment of the Born rule in
Hall (2013). Definition 4.9 of the joint spectrum goes back (at least) to Arens (1961)
and Hörmander (1966), §3.1.13.


§4.2. Density operators and normal states
These are really results about von Neumann algebras and come from the pertinent 
literature; our proofs derive from Li (1992), §1.8 and Takesaki (2002), Ch. III. 


§4.3. The Kadison-Singer Conjecture

As already mentioned in the notes to §2.6, the Kadison–Singer Conjecture was
first discussed in Kadison & Singer (1959) and was finally proved by Marcus, Spiel- 
man, & Srivastava (2014ab), following important intermediate contributions by e.g. 
Anderson (1979) and Weaver (2004). For an introduction including a complete proof 
see Stevens (2016), and for applications of the conjecture and its proof to other ar- 
eas of mathematics see Casazza et al (2005) as well as Casazza & Tremain (2016). 
Proposition 4.20 is due to Glimm (1960). 


§4.4. Gleason’s Theorem in arbitrary dimension

The extension of Gleason’s Theorem to non-separable Hilbert space assuming
complete additivity of $P$ is due to Maeda (1980). Maeda (1990) generalizes this
result to von Neumann algebras without summands of type $I_2$. The proof that,
assuming CH, countable additivity implies complete additivity (and hence Gleason’s
Theorem) was given by Eilers & Horst (1975). Proposition 4.30 is due to Aarnes
(1970), whose Theorem 1 is wrong: see Aarnes (1991). The proof of Theorem 4.32
is due to Döring (2004), using results of Hamhalter (1993).


Chapter 5 
Symmetry in quantum mechanics 


Roughly speaking, a symmetry of some mathematical object is an invertible trans- 
formation that leaves all relevant structure as it is. Thus a symmetry of a set is just a 
bijection (as sets have no further structure, whence invertibility is the only demand 
on a symmetry), a symmetry of a topological space is a homeomorphism, a sym- 
metry of a Banach space is a linear isometric isomorphism, and, crucially important 
for this chapter, a symmetry of a Hilbert space H is a unitary operator, i.e., a linear 
map $u : H \to H$ satisfying one and hence all of the following equivalent conditions:


• $uu^* = u^*u = 1_H$;

• $u$ is invertible with $u^{-1} = u^*$;

• $u$ is a surjective isometry (or, if $\dim(H) < \infty$, just an isometry);

• $u$ is invertible and preserves the inner product, i.e., $\langle u\varphi, u\psi \rangle = \langle \varphi, \psi \rangle$ ($\varphi, \psi \in H$).
The discussion of symmetries in quantum physics is based on the above idea, but the 
mathematically obvious choices need not be the physically relevant ones. Even in el- 
ementary quantum mechanics, where A = B(H), i.e., the C*-algebra of all bounded 
operators on some Hilbert space H, the concept of a symmetry is already diverse. 
The main structures whose symmetries we shall study in this chapter are: 


1. The normal pure state space $\mathcal{P}_1(H)$, i.e., the set of one-dimensional projections
on $H$, with transition probability $\tau : \mathcal{P}_1(H) \times \mathcal{P}_1(H) \to [0, 1]$ defined by (2.44).
2. The normal state space $\mathcal{D}(H)$, i.e. the convex set of density operators $\rho$ on $H$.
3. The self-adjoint operators $B(H)_{sa}$ on $H$, seen as a Jordan algebra (see below).
4. The effects $\mathcal{E}(H) = [0, 1]_{B(H)}$, seen as a convex partially ordered set (poset).
5. The projections $\mathcal{P}(H)$ on $H$, seen as an orthocomplemented lattice.
6. The unital commutative C*-subalgebras $\mathcal{C}(B(H))$ of $B(H)$, seen as a poset.


Each of these structures comes with its own notion of a symmetry, but the main 
point of this chapter will be to show these notions are equivalent, corresponding 
in all cases to either unitary or—surprisingly—anti-unitary operators, both merely 
defined up to a phase. The latter subtlety will open the world of projective unitary
group representations to quantum mechanics (without which the existence of spin-$\frac{1}{2}$
particles such as electrons, and therewith also of ourselves, would be impossible).


© The Author(s) 2017 125 
K. Landsman, Foundations of Quantum Theory, 
Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3_5 




5.1 Six basic mathematical structures of quantum mechanics 


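We first recall the objects just described in a bit more detail. We have:


    $\mathcal{P}_1(H) = \{e \in B(H) \mid e^2 = e^* = e, \ \mathrm{Tr}(e) = \dim(eH) = 1\}$; (5.1)
    $\mathcal{D}(H) = \{\rho \in B(H) \mid \rho \geq 0, \ \mathrm{Tr}(\rho) = 1\}$; (5.2)
    $B(H)_{sa} = \{a \in B(H) \mid a^* = a\}$; (5.3)
    $\mathcal{E}(H) = \{a \in B(H) \mid 0 \leq a \leq 1_H\}$; (5.4)
    $\mathcal{P}(H) = \{e \in B(H) \mid e^2 = e^* = e\}$; (5.5)
    $\mathcal{C}(B(H)) = \{C \subset B(H) \mid C \text{ commutative C*-algebra}, \ 1_H \in C\}$. (5.6)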


The point is that each of these sets has some additional structure that defines what it 
means to be a symmetry of it, as we now spell out in detail. 


Definition 5.1. Let $H$ be a Hilbert space (not necessarily finite-dimensional).


1. A Wigner symmetry (of $H$) is a bijection

    $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ (5.7)

that satisfies

    $\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef)$, $e, f \in \mathcal{P}_1(H)$. (5.8)

2. A Kadison symmetry is an affine bijection

    $K : \mathcal{D}(H) \to \mathcal{D}(H)$, (5.9)

i.e. a bijection $K$ that preserves convex sums: for $t \in (0, 1)$ and $\rho_1, \rho_2 \in \mathcal{D}(H)$,

    $K(t\rho_1 + (1 - t)\rho_2) = tK\rho_1 + (1 - t)K\rho_2$. (5.10)

3. a. A Jordan symmetry is an invertible Jordan map

    $J : B(H)_{sa} \to B(H)_{sa}$, (5.11)

i.e., an $\mathbb{R}$-linear bijection that satisfies the equivalent conditions

    $J(a \circ b) = J(a) \circ J(b)$; (5.12)
    $J(a^2) = J(a)^2$. (5.13)

Here

    $a \circ b = \frac{1}{2}(ab + ba)$ (5.14)

is the Jordan product on $B(H)_{sa}$, which turns the (real) vector space $B(H)_{sa}$
into a Jordan algebra, cf. §C.25.

b. A weak Jordan symmetry is an invertible weak Jordan map, i.e., a bijection
(5.11) of which the restriction $J|_{C_{sa}}$ is a Jordan map for each $C \in \mathcal{C}(B(H))$.

4. A Ludwig symmetry is an affine order isomorphism

    $L : \mathcal{E}(H) \to \mathcal{E}(H)$. (5.15)

5. A von Neumann symmetry is an order isomorphism

    $N : \mathcal{P}(H) \to \mathcal{P}(H)$ (5.16)

preserving orthocomplementation, i.e. $N(1 - e) = 1 - N(e)$ for each $e \in \mathcal{P}(H)$.
6. A Bohr symmetry is an order isomorphism

    $B : \mathcal{C}(B(H)) \to \mathcal{C}(B(H))$. (5.17)


In nos. 4-6, an order isomorphism $O$ of the given poset is a bijection that
preserves the partial order $\leq$ (i.e., if $x \leq y$, then $O(x) \leq O(y)$) and whose inverse
$O^{-1}$ does so, too; cf. §D.1. The names in question have been chosen for historical
reasons and (except perhaps for the first and third) are not standard.

Let us note that any Jordan map has a unique extension to a $\mathbb{C}$-linear map

    $J_{\mathbb{C}} : B(H) \to B(H)$; (5.18)
    $J_{\mathbb{C}}(a^*) = J_{\mathbb{C}}(a)^*$, (5.19)

which satisfies (5.12) for all $a, b$, as well as

    $J_{\mathbb{C}}(a + ib) = J(a) + iJ(b)$, (5.20)

with notation as in Proposition 2.6. Conversely, such a Jordan map (5.18) defines
a real Jordan map (5.11) by $J = J_{\mathbb{C}}|_{B(H)_{sa}}$. Similarly, a weak Jordan symmetry is
equivalent to a map (5.18) that satisfies (5.19), preserves squares as in (5.13), and is
linear on each subspace $C$ of $B(H)$, with $C \in \mathcal{C}(B(H))$. In other words (in the spirit
of Bohrification), $J_{\mathbb{C}}$ is a homomorphism of C*-algebras on each commutative unital
C*-subalgebra $C \subset B(H)$. Therefore, either way $J$ and $J_{\mathbb{C}}$ are essentially the same
thing, and if no confusion may arise we call it $J$. Note that a weak Jordan map $J$ a
priori satisfies (5.12) only for commuting self-adjoint $a$ and $b$. It follows that weak
(and hence ordinary) Jordan symmetries are unital: since

    $J(b) = J(1_H \circ b) = J(1_H) \circ J(b)$ (5.21)

for any $b$, we may pick $b = J^{-1}(1_H)$ to find, reading (5.21) from right to left,

    $J(1_H) = J(1_H) \circ 1_H = 1_H$. (5.22)
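
The equivalence of (5.12) and (5.13) for a linear map rests on the polarization identity $a \circ b = \frac{1}{2}((a + b)^2 - a^2 - b^2)$, which expresses the Jordan product (5.14) through squares. A minimal Python sketch checking this identity on two non-commuting self-adjoint $2 \times 2$ matrices follows; the matrices and helper names are hypothetical choices for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(alpha, a, beta, b):
    """Entrywise linear combination alpha*a + beta*b."""
    n = len(a)
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(n)] for i in range(n)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2, cf. (5.14)."""
    return combine(0.5, matmul(a, b), 0.5, matmul(b, a))

def square(a):
    return matmul(a, a)

# Two hypothetical self-adjoint matrices that do not commute.
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]

# Polarization: a o b = ((a + b)^2 - a^2 - b^2) / 2.
s = combine(1, a, 1, b)
polarized = combine(0.5, combine(1, square(s), -1, square(a)), -0.5, square(b))
direct = jordan(a, b)
```

Since the right-hand side of the identity involves only squares and linear operations, a linear map preserving squares automatically preserves the Jordan product.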


The special role of unitary operators u now emerges: each such operator defines 
the relevant symmetry in the obvious way, namely, in order of appearance: 




    $W(e) = ueu^*$; (5.23)
    $K(\rho) = u\rho u^*$; (5.24)
    $L(a) = uau^*$; (5.25)
    $J(a) = uau^*$; (5.26)
    $N(e) = ueu^*$; (5.27)
    $B(C) = uCu^*$, (5.28)


where $a^* = a$ in (5.26). If not, this formula remains valid also for the map $J_{\mathbb{C}}$.
Furthermore, in (5.28) the notation $uCu^*$ is shorthand for the set $\{uau^* \mid a \in C\}$, which
is easily seen to be a member of $\mathcal{C}(B(H))$. Here, as well as in the other five cases,
it is easy to verify that the right-hand side belongs to the required set, that is,


    $ueu^* \in \mathcal{P}_1(H)$, $u\rho u^* \in \mathcal{D}(H)$, $uau^* \in \mathcal{E}(H)$, (5.29)
    $uau^* \in B(H)_{sa}$, $ueu^* \in \mathcal{P}(H)$, $uCu^* \in \mathcal{C}(B(H))$, (5.30)


respectively, provided, of course, that
$e \in \mathcal{P}_1(H)$, $\rho \in \mathcal{D}(H)$, $a \in \mathcal{E}(H)$, $a \in B(H)_{sa}$, $e \in \mathcal{P}(H)$, $C \in \mathcal{C}(B(H))$.
Indeed, if, in (5.23), $e = e_\psi = |\psi\rangle\langle\psi|$ for some unit vector $\psi \in H$, then

    $u e_\psi u^* = e_{u\psi}$. (5.31)


If $\rho \geq 0$ in that $\langle \psi, \rho\psi \rangle \geq 0$ for each $\psi \in H$, then clearly also $u\rho u^* \geq 0$, and if
$\mathrm{Tr}(\rho) = 1$, then also $\mathrm{Tr}(u\rho u^*) = 1$. If $a^* = a$, then


    $(uau^*)^* = u^{**}a^*u^* = uau^*$. (5.32)
However, one may also choose u in these formulae to be anti-unitary, as follows: 


Definition 5.2. 1. A real-linear operator $u : H \to H$ is anti-linear if

    $u(z\psi) = \bar{z}\,u\psi$ ($z \in \mathbb{C}$). (5.33)


2. An anti-linear operator $u : H \to H$ is anti-unitary if it is invertible, and


    $\langle u\varphi, u\psi \rangle = \overline{\langle \varphi, \psi \rangle}$ ($\varphi, \psi \in H$). (5.34)

The adjoint $u^*$ of a (bounded) anti-linear operator $u$ is defined by the property


    $\langle u^*\varphi, \psi \rangle = \overline{\langle \varphi, u\psi \rangle}$ ($\varphi, \psi \in H$), (5.35)


in which case $u^*$ is anti-linear, too. Hence we may equally well say that an anti-linear
operator is anti-unitary if $uu^* = u^*u = 1_H$. The simplest example is the map

    $J : \mathbb{C}^n \to \mathbb{C}^n$;
    $Jz = \bar{z}$, (5.36)

i.e., if $z = (z_1, \ldots, z_n) \in \mathbb{C}^n$, then $(Jz)_i = \bar{z}_i$. Similarly, one may define

    $J : \ell^2 \to \ell^2$;
    $J\psi = \bar{\psi}$, (5.37)


and likewise on $L^2$, where complex conjugation is defined pointwise, that is,


    $(J\psi)(x) = \overline{\psi(x)}$. (5.38)


For any Hilbert space one may pick a basis $(v_i)$ and define $J$ relative to this basis by


    $J\left(\sum_i c_i v_i\right) = \sum_i \bar{c}_i v_i$. (5.39)
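
The defining properties (5.33) and (5.34) can be verified directly for the conjugation (5.36). The following minimal Python sketch does so on $\mathbb{C}^3$; the vectors and the scalar are arbitrary values assumed for the example.

```python
def inner(v, w):
    """Inner product <v, w>, conjugate-linear in the first entry."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def J(v):
    """Componentwise complex conjugation, as in (5.36)."""
    return [complex(c).conjugate() for c in v]

# Hypothetical test data.
phi = [1 + 2j, -0.5j, 3.0]
psi = [2 - 1j, 1 + 1j, -1j]
z = 0.3 + 0.4j

# Anti-linearity (5.33): J(z psi) = conj(z) J(psi).
lhs_antilinear = J([z * c for c in psi])
rhs_antilinear = [z.conjugate() * c for c in J(psi)]

# Anti-unitarity (5.34): <J phi, J psi> = conj(<phi, psi>).
lhs_antiunitary = inner(J(phi), J(psi))
rhs_antiunitary = inner(phi, psi).conjugate()
```

Applying `J` twice returns the original vector, so this $J$ is its own inverse.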


For future use, we state two obvious facts. 


Proposition 5.3. 1. The product of two anti-unitary operators is unitary. 
2. Any anti-unitary operator u: H — H takes the form u = Jv, where v is unitary 
and J is an anti-unitary operator on H of the kind constructed above. 


It is an easy verification that (5.23) - (5.28) still define symmetries if u is anti- 
unitary. Note that in terms of the complexification Jc, eq. (5.26) should read 


Jc(a) =ua*u*. (5.40) 


The goal of the following sections is to show that these are the only possibilities: 
Theorem 5.4. Let H be a Hilbert space, with dim(H) > 1. 


1. Each Wigner symmetry takes the form (5.23);
2. Each Kadison symmetry takes the form (5.24); 
3. Each Ludwig symmetry takes the form (5.25); 
4. a. Each Jordan symmetry takes the form (5.26); 

b. If dim(H) > 2, also each weak Jordan symmetry takes this form; 
5. If dim(H) > 2, each von Neumann symmetry takes the form (5.27); 
6. Again if dim(H) > 2, each Bohr symmetry takes the form (5.28), 


where in all cases the operator $u$ is either unitary or anti-unitary, and is uniquely
determined by the symmetry in question up to a phase (that is, $u$ and $u'$ implement
the same symmetry by conjugation iff $u' = zu$, where $z \in \mathbb{T}$).


As we shall see, the reason why the case $H = \mathbb{C}^2$ is exceptional with regard to weak
Jordan symmetries, von Neumann symmetries, and Bohr symmetries is that in those
cases the proof relies on Gleason’s Theorem, which fails for $H = \mathbb{C}^2$.

To see this more explicitly, and also to prove the positive cases (i.e., nos. 1-4a) in
a simple situation without invoking higher principles, before proving Theorem 5.4
in general it is instructive to first illustrate it in the two-dimensional case $H = \mathbb{C}^2$.

5.2 The case $H = \mathbb{C}^2$


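We start with some background. Any complex $2 \times 2$ matrix $a$ can be written as


    $a = a(x_0, x_1, x_2, x_3) = \frac{1}{2} \sum_{\mu=0}^{3} x_\mu \sigma_\mu$ ($x_\mu \in \mathbb{C}$); (5.41)

    $\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, (5.42)


i.e., the Pauli matrices. Furthermore, if we equip the vector space $M_2(\mathbb{C})$ of complex
$2 \times 2$ matrices with the canonical inner product (2.34), then the rescaled matrices
$\tilde{\sigma}_\mu = \sigma_\mu / \sqrt{2}$ form a basis (= orthonormal basis) of the ensuing Hilbert space.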
Writing x = (x1,x2,x3), some interesting special cases are: 


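• $x_0 \in \mathbb{R}$, $x = iv$ with $v \in \mathbb{R}^3$ and $x_0^2 + v_1^2 + v_2^2 + v_3^2 = 4$, which holds iff $a \in SU(2)$;
• $x_\mu \in \mathbb{R}$ for each $\mu = 0, 1, 2, 3$, which is the case iff $a^* = a$;
• $x_0 = 1$, $x \in \mathbb{R}^3$, and $\|x\| = 1$, which holds iff $a$ is a one-dimensional projection.


The first case follows because $SU(2)$ consists of all matrices of the form

    $\begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix}$, $\alpha, \beta \in \mathbb{C}$, $|\alpha|^2 + |\beta|^2 = 1$, (5.43)

where the factor $\frac{1}{2}$ in (5.41) gives $|\alpha|^2 + |\beta|^2 = \frac{1}{4}(x_0^2 + \|v\|^2)$.
The second case is obvious, and the third follows from Proposition 2.9.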

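Assume the third case, so that $a = e$ with $e^2 = e^* = e$ and $\mathrm{Tr}(e) = 1$. If a linear
map $u : \mathbb{C}^2 \to \mathbb{C}^2$ is unitary, then simple computations show that $e' = ueu^*$ is a
one-dimensional projection, too, given by $e' = \frac{1}{2} \sum_{\mu=0}^{3} x'_\mu \sigma_\mu$ with $x'_0 = 1$, $x' \in \mathbb{R}^3$, and
$\|x'\| = 1$. Writing $x' = Rx$ for some map $R : S^2 \to S^2$, we have


    $u(x \cdot \sigma)u^* = (Rx) \cdot \sigma$, (5.44)


where $x \cdot \sigma = \sum_{j=1}^{3} x_j \sigma_j$. This also shows that $R$ extends to a linear isometry
$R : \mathbb{R}^3 \to \mathbb{R}^3$. Using the formula $\mathrm{Tr}(\sigma_i \sigma_j) = 2\delta_{ij}$, the matrix-form of $R$ follows as


    $R_{ij} = \frac{1}{2} \mathrm{Tr}(u \sigma_j u^* \sigma_i)$. (5.45)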


Define $U(2)$ as the (connected) group of all unitary $2 \times 2$ matrices (whose connected
subgroup $SU(2)$ of elements with unit determinant has just been mentioned). Also,
recall that $O(3)$ is the group of all real orthogonal $3 \times 3$ matrices $M$, a condition that
may be expressed in (at least) four equivalent ways (like unitarity):


• $MM^T = M^TM = 1_3$;

• $M$ is invertible and $M^T = M^{-1}$;

• $M$ is an isometry (and hence it is injective and therefore invertible);
• $M$ preserves the inner product: $\langle Mx, My \rangle = \langle x, y \rangle$ for all $x, y \in \mathbb{R}^3$.

This implies $\det(M) = \pm 1$ (as can be seen by diagonalizing $M$ over $\mathbb{C}$: being an
isometry, its eigenvalues have modulus 1, the non-real ones occur in complex-conjugate
pairs, and $\det(M)$ is their product). Thus $O(3)$
breaks up into two parts $O_\pm(3) = \{R \in O(3) \mid \det(R) = \pm 1\}$, of which $O_+(3) = SO(3)$
consists of rotations. Using an explicit parametrization of $SO(3)$, e.g., through Euler
angles, or using surjectivity of the exponential map (from the Lie algebra of $SO(3)$,
which consists of anti-symmetric real matrices), it follows that $O_\pm(3)$ are precisely
the two connected components of $O(3)$, the identity of course lying in $O_+(3)$.


Proposition 5.5. The map $u \mapsto R$ defined by (5.44) is a homomorphism from $U(2)$
onto $SO(3)$. In terms of $SU(2) \subset U(2)$, this map restricts to a two-fold covering


    $\tilde{\pi} : SU(2) \to SO(3)$, (5.46)


with discrete kernel


    $\ker(\tilde{\pi}) = \{1_2, -1_2\}$. (5.47)


Proof. As a finite-dimensional linear isometry, $R$ is automatically invertible (this
also follows from unitarity and hence invertibility of $u$), hence $R \in O(3)$. It is
obvious from (5.44) that $u \mapsto R$ is a continuous homomorphism (of groups). Since
$U(2)$ is connected and $u \mapsto R$ is continuous, $R$ must lie in the connected component
of $O(3)$ containing the identity, whence $R \in SO(3)$. To show surjectivity of $\tilde{\pi}$, take
some unit vector $\mathbf{u} \in \mathbb{R}^3$ and define $u = \cos(\frac{1}{2}\theta) + i \sin(\frac{1}{2}\theta)\, \mathbf{u} \cdot \sigma$. The corresponding
rotation $R_\theta(\mathbf{u})$ is the one around $\mathbf{u}$ by an angle $\theta$, and such rotations generate $SO(3)$.

Finally, it follows from (5.44) that $u \in \ker(\tilde{\pi})$ iff $u$ commutes with each $\sigma_i$ and
hence, by (5.41), with all matrices. Therefore, $u = z \cdot 1_2$ for some $z \in \mathbb{C}$, upon which
the condition $\det(u) = 1$ (in that $u \in SU(2)$) enforces $z = \pm 1$.
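
Formula (5.45) lends itself to a direct numerical check. The Python sketch below builds $u = \cos(\frac{1}{2}\theta) + i \sin(\frac{1}{2}\theta)\, \mathbf{u} \cdot \sigma$ for an arbitrarily chosen angle and axis (both assumptions of this example), computes $R$ from (5.45), and leaves it to the reader to confirm that $R$ is a proper rotation fixing the axis and that $u$ and $-u$ yield the same $R$. The function names are hypothetical.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def su2(theta, n):
    """u = cos(theta/2) + i sin(theta/2) n.sigma for a unit vector n in R^3."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (r == col) + 1j * s * sum(n[k] * SIGMA[k][r][col] for k in range(3))
             for col in range(2)] for r in range(2)]

def rotation(u):
    """R_ij = (1/2) Tr(u sigma_j u* sigma_i), cf. (5.45)."""
    ustar = adjoint(u)
    def entry(i, j):
        m = matmul(matmul(matmul(u, SIGMA[j]), ustar), SIGMA[i])
        return (0.5 * (m[0][0] + m[1][1])).real
    return [[entry(i, j) for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

theta = 1.1                      # hypothetical rotation angle
axis = (1 / 3, 2 / 3, 2 / 3)     # hypothetical unit vector in R^3
u = su2(theta, axis)
minus_u = [[-c for c in row] for row in u]
R = rotation(u)
R_minus = rotation(minus_u)
```

That `R` and `R_minus` agree reflects the kernel (5.47): the covering cannot distinguish $u$ from $-u$.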


Note that the covering (5.46) is topologically nontrivial (i.e., $SU(2) \neq SO(3) \times \mathbb{Z}_2$),
since $SU(2) \cong S^3$ is simply connected, whereas $SO(3)$ is doubly connected: a closed
path $t \mapsto R_{2\pi t}(\mathbf{u})$, $t \in [0, 1]$ in $SO(3)$ (starting and ending at $1_3$) lifts to a path


    $t \mapsto \cos(\pi t) + i \sin(\pi t)\, \mathbf{u} \cdot \sigma$


in $SU(2)$ that starts at the unit matrix $1_2$ and ends at $-1_2$.

To incorporate $O_-(3)$, let $U_a(2)$ be the set of all anti-unitary $2 \times 2$ matrices.
These do not form a group, as the product of two anti-unitaries is unitary, but the
union $U(2) \cup U_a(2)$ is a disconnected Lie group with identity component $U(2)$.


Proposition 5.6. The map $u \mapsto R$ defined by (5.44) is a surjective homomorphism

    $\tilde{\pi}' : U(2) \cup U_a(2) \to O(3)$, (5.48)

with kernel $U(1)$, seen as the diagonal matrices $z \cdot 1_2$, $z \in \mathbb{T}$. Moreover, $\tilde{\pi}'$ maps
$U(2)$ onto $SO(3)$ and maps $U_a(2)$ onto $O_-(3)$.


Proof. The map $u \mapsto R$ in (5.44) sends the anti-unitary operator $u = J$ on $\mathbb{C}^2$ to
$R = \mathrm{diag}(1, -1, 1) \in O_-(3)$. Since $U_a(2) = J \cdot U(2)$ and similarly $O_-(3) = R \cdot SO(3)$,
the last claim follows. The computation of the kernel may now be restricted to $U(2)$,
and then follows as in the last step of the proof of the previous proposition.

We now return to Theorem 5.4 and go through its special cases one by one.
Part 1 of Theorem 5.4 is Wigner’s Theorem, which in the case at hand reads:


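Theorem 5.7. Each bijection $W : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2)$ that satisfies

    $\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef)$ (5.49)


for each $e, f \in \mathcal{P}_1(\mathbb{C}^2)$ takes the form $W(e) = ueu^*$, where $u$ is either unitary or
anti-unitary, and is uniquely determined by $W$ up to a phase.


To prove this, we transfer the whole situation to the two-sphere, where it is easy:
Proposition 5.8. The pure state space $\mathcal{P}_1(\mathbb{C}^2)$ corresponds bijectively to the sphere


    $S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$,


in that each one-dimensional projection $e \in \mathcal{P}_1(\mathbb{C}^2)$ may be expressed uniquely as


    $e(x, y, z) = \frac{1}{2} \begin{pmatrix} 1 + z & x - iy \\ x + iy & 1 - z \end{pmatrix}$, (5.50)


where $(x, y, z) \in \mathbb{R}^3$ and $x^2 + y^2 + z^2 = 1$. Under the ensuing bijection

    $\mathcal{P}_1(\mathbb{C}^2) \cong S^2$, (5.51)

Wigner symmetries $W$ of $\mathbb{C}^2$ turn into orthogonal maps $R \in O(3)$, restricted to $S^2$.


Proof. The first claim restates Proposition 2.9. If $\psi$ and $\psi'$ are unit vectors in $\mathbb{C}^2$
with corresponding one-dimensional projections $e_\psi(x, y, z)$ and $e_{\psi'}(x', y', z')$ then, as
one easily verifies, the corresponding transition probability takes the form


    $\mathrm{Tr}(e_\psi e_{\psi'}) = \frac{1}{2}(1 + \langle \mathbf{x}, \mathbf{x}' \rangle) = \cos^2(\frac{1}{2}\theta(\mathbf{x}, \mathbf{x}'))$, (5.52)


where $\mathbf{x} = (x, y, z)$, $\mathbf{x}' = (x', y', z')$, and $\theta(\mathbf{x}, \mathbf{x}')$ is the arc (i.e., geodesic) distance
between $\mathbf{x}$ and $\mathbf{x}'$. Consequently,
if $W : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2)$ satisfies (5.8), then the corresponding map $R : S^2 \to S^2$
(defined through the above identification $\mathcal{P}_1(\mathbb{C}^2) \cong S^2$) satisfies


    $\langle R(\mathbf{x}), R(\mathbf{x}') \rangle = \langle \mathbf{x}, \mathbf{x}' \rangle$ ($\mathbf{x}, \mathbf{x}' \in S^2$). (5.53)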
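
The transition probability formula (5.52) is easy to confirm numerically from the explicit matrix (5.50). The following Python sketch does this for two points on $S^2$; the spherical angles are arbitrary values assumed for the example, and the helper names are hypothetical.

```python
import math

def projection(x, y, z):
    """e(x, y, z) of (5.50) for a unit vector (x, y, z) in R^3."""
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

def trace_of_product(a, b):
    """Tr(ab) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def unit_vector(polar, azimuth):
    """Point on S^2 in spherical coordinates."""
    return (math.sin(polar) * math.cos(azimuth),
            math.sin(polar) * math.sin(azimuth),
            math.cos(polar))

x = unit_vector(0.4, 1.3)
xp = unit_vector(2.0, -0.7)
dot = sum(a * b for a, b in zip(x, xp))
transition = trace_of_product(projection(*x), projection(*xp)).real
angle = math.acos(dot)  # arc distance between the two points on S^2
```

Here `transition` agrees with both `0.5 * (1 + dot)` and `math.cos(angle / 2) ** 2` up to rounding.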


Lemma 5.9. If some bijection $R : S^2 \to S^2$ satisfies (5.53), then $R$ extends (uniquely)
to an orthogonal linear map (for simplicity also called) $R : \mathbb{R}^3 \to \mathbb{R}^3$.


Proof. With $(u_1, u_2, u_3)$ the standard basis of $\mathbb{R}^3$, define a $3 \times 3$ matrix by

    $R_{kl} = \langle u_k, R(u_l) \rangle$. (5.54)


It follows from (5.53) that $R^{-1}(u_j)_k = R_{jk}$, which implies $\langle R^{-1}(u_j), x \rangle = \sum_k R_{jk} x_k$,
or, once again using (5.53), $R(x)_j = \sum_k R_{jk} x_k$. Hence the map $x \mapsto \sum_{j,k} R_{jk} x_k u_j$, i.e.,
the usual linear map defined by the matrix (5.54), extends the given bijection $R$.
Orthogonality of this linear map is, of course, equivalent to (5.53).

Wigner’s Theorem then follows by combining Propositions 5.6 and 5.8: given
the linear map $R$ just constructed, read (5.44) from right to left, where $u$ exists by
surjectivity of the map (5.48), and the precise lack of uniqueness of $u$ as claimed in
Theorem 5.4 is just a restatement of the fact that (5.48) has $U(1)$ as its kernel.
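
Two ingredients of this argument can be illustrated numerically: the anti-unitary $J$ is mapped to $\mathrm{diag}(1, -1, 1)$, and a unitary $u$ and its multiple $zu$ by a phase $z \in \mathbb{T}$ give the same $R$. The Python sketch below assumes that $J a J$ is the entrywise complex conjugate of a matrix $a$ (which holds for complex conjugation in the standard basis of $\mathbb{C}^2$); the unitary $u$, the phase, and the helper names are hypothetical choices.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj(a):
    """Entrywise complex conjugate."""
    return [[complex(c).conjugate() for c in row] for row in a]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation(images):
    """R_ij = (1/2) Tr(images[j] sigma_i), where images[j] = u sigma_j u*."""
    def entry(i, j):
        m = matmul(images[j], SIGMA[i])
        return (0.5 * (m[0][0] + m[1][1])).real
    return [[entry(i, j) for j in range(3)] for i in range(3)]

def conjugate_by(u):
    """Images u sigma_j u* of the Pauli matrices under a unitary u."""
    return [matmul(matmul(u, s), adjoint(u)) for s in SIGMA]

# Anti-unitary u = J: J sigma_j J is the entrywise conjugate of sigma_j.
R_J = rotation([conj(s) for s in SIGMA])

# A hypothetical unitary u and a phase z on the unit circle.
u = [[0.6, 0.8j], [0.8j, 0.6]]
z = complex(math.cos(0.9), math.sin(0.9))
zu = [[z * c for c in row] for row in u]
R_u = rotation(conjugate_by(u))
R_zu = rotation(conjugate_by(zu))
```

The equality of `R_u` and `R_zu` is the numerical counterpart of the kernel $U(1)$ of (5.48).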


Kadison’s Theorem is part 2 of Theorem 5.4. Explicitly, for H = C? we have: 


Theorem 5.10. Each affine bijection $K : \mathcal{D}(\mathbb{C}^2) \to \mathcal{D}(\mathbb{C}^2)$ is given as $K(\rho) = u\rho u^*$,
where $u$ is unitary or anti-unitary, and is uniquely determined by $K$ up to a phase.


Proof. We once again invoke Proposition 2.9, implying that any density matrix $\rho$
on $\mathbb{C}^2$ takes the form

    $\rho = \frac{1}{2}\left(1_2 + \sum_{\mu=1}^{3} x_\mu \sigma_\mu\right)$, (5.55)

with $\|x\| \leq 1$. Moreover, the ensuing bijection $\mathcal{D}(\mathbb{C}^2) \cong B^3$, $\rho \leftrightarrow x$, is clearly affine,
in that convex sums $t\rho + (1 - t)\rho'$ of density matrices correspond to convex sums
$tx + (1 - t)x'$ of the corresponding vectors in $\mathbb{R}^3$.


Lemma 5.11. Any affine bijection K of the unit ball B? in R°? is given by an orthog- 
onal linear map R € O(3). 


Proof, First, K must map the boundary 0,B* = S* to itself (necessarily bijectively): 
if x € S? and K(x) = tx’ + (1 —1)x”, then x =1K~!(x’) + (1 —1)K~!(x”), whence 


K-1(x') = K"!(x"), (5.56) 


since x is pure, whence x’ = x”, so that also K(x) is pure. 
Second, the basis of all further steps is the property 


K(0) =0. (5.57) 


This is because 0 is intrinsic to the convex structure of $B^3$: it is the unique point
with the property that for any $x \in S^2$ there exists a unique $x'$ such that $\tfrac{1}{2}x + \tfrac{1}{2}x' = 0$,
namely $x' = -x$. Thus 0 must be preserved under affine bijections. For a formal
proof (by contradiction), suppose $K(0) \neq 0$, and define $y = K(0)/\|K(0)\| \in S^2$. Then
$K(0)$ has an extremal decomposition $K(0) = ty + (1-t)y'$, with $y' = -y$ and
$t = \tfrac{1}{2}(1+\|K(0)\|)$. Applying the affine map $K^{-1}$ then gives

$$K^{-1}(y') = -\frac{1+\|K(0)\|}{1-\|K(0)\|}\, K^{-1}(y).$$

Now $y \in S^2$ and hence $K^{-1}(y) \in S^2$ by part one of this proof (applied to $K^{-1}$), so
that $\|K^{-1}(y)\| = 1$. But this implies $\|K^{-1}(y')\| > 1$, which is impossible because
$y' \in S^2$ and hence $\|K^{-1}(y')\| = 1$.

Third, for $x \in B^3$ and $t \in [0,1]$ the preceding point implies that

$$K(tx) = K(tx + (1-t)0) = tK(x) + (1-t)K(0) = tK(x). \tag{5.58}$$

The same then holds for $x \in B^3$ and all $t \geq 0$ as long as $tx \in B^3$: for take $t > 1$, so
that $t^{-1} \in (0,1)$, and use the previous step with $x \rightsquigarrow tx$ and $t \rightsquigarrow t^{-1}$ to compute

$$K(tx) = t\,t^{-1}K(tx) = tK(t^{-1}tx) = tK(x).$$

Also, (5.58) and affinity imply that for any $x,y \in B^3$ for which $x+y \in B^3$, we have

$$K(x+y) = 2K(\tfrac{1}{2}x + \tfrac{1}{2}y) = 2\cdot(\tfrac{1}{2}K(x) + \tfrac{1}{2}K(y)) = K(x) + K(y). \tag{5.59}$$

With our earlier result (5.57), this also gives

$$K(-x) = -K(x). \tag{5.60}$$
For some nonzero $x \in \mathbb{R}^3$, take $s \geq \|x\|$ and $t \geq \|x\|$. Then by (5.58) we have

$$sK(x/s) = sK\Big(\frac{t}{s}\cdot\frac{x}{t}\Big) = tK(x/t).$$

We may therefore define a map $R: \mathbb{R}^3 \to \mathbb{R}^3$ by

$$R(0) = 0; \tag{5.61}$$
$$R(x) = s\cdot K(x/s) \quad (x \neq 0), \tag{5.62}$$

for any choice of $s \geq \|x\|$. For $x \in B^3$ we may take $s = 1$, so that $R$ extends $K$.
To prove that $R$ is linear, for nonzero $x \in \mathbb{R}^3$ and $t \geq 0$ pick some $s \geq t\|x\|$ and compute

$$R(tx) = sK\Big(\frac{tx}{s}\Big) = sK\Big(\frac{t\|x\|}{s}\cdot\frac{x}{\|x\|}\Big) = t\|x\|\cdot K\Big(\frac{x}{\|x\|}\Big) = tR(x). \tag{5.63}$$

For $t < 0$, we first show from (5.60) and (5.62) that

$$R(-x) = -R(x), \tag{5.64}$$

upon which (5.63) gives

$$R(tx) = R(|t|\cdot(-x)) = |t|R(-x) = -|t|R(x) = tR(x). \tag{5.65}$$


Furthermore, for given $x,y \in \mathbb{R}^3$, pick $s' > 0$ such that $s' \geq \|x\|$ and $s' \geq \|y\|$, so that
$s = 2s' \geq \|x+y\|$ by the triangle inequality, and use (5.59) to compute

$$R(x+y) = sK\Big(\frac{x+y}{s}\Big) = sK\Big(\frac{x}{s} + \frac{y}{s}\Big) = sK(x/s) + sK(y/s) = R(x) + R(y). \tag{5.66}$$


Finally, R is an isometry by (5.62) and step one of the proof. Being also linear and 
invertible, R must therefore be an orthogonal transformation. 
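
The extension step (5.61)-(5.62) can be tried out numerically. The following is only a minimal sketch, assuming a sample affine bijection `make_K` (a rotation about the z-axis, chosen here purely for illustration) that refuses arguments outside the unit ball; the helper names `extend`, `norm` and `close` are likewise hypothetical and not part of the text.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def make_K(theta):
    """Sample affine bijection of the unit ball B^3 (an assumption for this sketch):
    rotation about the z-axis, deliberately undefined outside the ball."""
    c, s = math.cos(theta), math.sin(theta)
    def K(x):
        if norm(x) > 1.0 + 1e-12:
            raise ValueError("K is only defined on the unit ball")
        return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])
    return K

def extend(K):
    """Return R with R(0) = 0 and R(x) = s*K(x/s) for any s >= ||x||, cf. (5.61)-(5.62)."""
    def R(x, s=None):
        n = norm(x)
        if n == 0.0:
            return (0.0, 0.0, 0.0)
        if s is None:
            s = max(n, 1.0)  # s = 1 inside the ball, so R agrees with K there
        if s < n:
            raise ValueError("need s >= ||x||")
        return tuple(s * c for c in K(tuple(c / s for c in x)))
    return R

def close(p, q, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(p, q))

K = make_K(0.7)
R = extend(K)
x, y = (3.0, -4.0, 12.0), (1.0, 2.0, -2.0)
x_plus_y = tuple(a + b for a, b in zip(x, y))
```

Evaluating `R(x, s=13.0)` and `R(x, s=50.0)` illustrates that the choice of $s$ does not matter, while `R(x_plus_y)` versus the sum of `R(x)` and `R(y)` illustrates additivity as in (5.66).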
Given step one, an alternative proof derives this lemma from Proposition 5.18 below,
which shows that the transition probabilities (5.52) on $S^2$ are determined by the
convex structure of $B^3$, so that affine bijections must preserve them. In other words,
the boundary map $S^2 \to S^2$ defined by $K$ preserves transition probabilities and hence
satisfies the conditions of Lemma 5.9. This reasoning effectively reduces Kadison’s
Theorem to Wigner’s Theorem, a move we will later examine in general.

In any case, Theorem 5.10 now follows from Lemma 5.11 in exactly the same
way as Theorem 5.7 followed from the corresponding Lemma 5.9.


We have given this proof in some detail, because step 3 will recur on other occasions 
where a given affine bijection is to be extended to some linear map. 


Ludwig’s Theorem is part 3 of Theorem 5.4. For $H = \mathbb{C}^2$, we have:

Theorem 5.12. Each affine order isomorphism $L: \mathcal{E}(\mathbb{C}^2) \to \mathcal{E}(\mathbb{C}^2)$ reads $L(a) = uau^*$,
where $u$ is unitary or anti-unitary, and is uniquely fixed by $L$ up to a phase.


Proof. Using the parametrization (5.41), we have $a(x_0,x_1,x_2,x_3) \in \mathcal{E}(\mathbb{C}^2)$ iff each
$x_\mu$ is real and $0 \leq x_0 \pm \|x\| \leq 2$. In particular, we have $0 \leq x_0 \leq 2$. This easily follows
from (2.38), noting that $a \in \mathcal{E}(\mathbb{C}^2)$ just means that $a^* = a$ and that both eigenvalues
of $a$ lie in $[0,1]$. Thus $\mathcal{E}(\mathbb{C}^2)$ is isomorphic as a convex set to a convex subset $C$ of
$\mathbb{R}^4$ that is fibered over the $x_0$-interval $[0,2]$, where the fiber $C_{x_0}$ of $C$ over $x_0$ is the
three-ball $B^3_{x_0}$ with radius $\|x\| = x_0$ as long as $0 \leq x_0 \leq 1$, whereas for $1 \leq x_0 \leq 2$
the fiber is $B^3_{2-x_0}$, so at $x_0 = 1$ the fiber is $C_1 = B^3_1 = B^3$ (in one dimension less,
this convex body is easily visualizable as a double cone in $\mathbb{R}^3$, where the fibers are
disks). The partial order on $C$ induced from the one on $\mathcal{E}(\mathbb{C}^2)$ is given by

$$(x_0, x) \leq (x_0', x') \ \text{ iff } \ x_0' - x_0 \geq \|x' - x\|, \tag{5.67}$$

which follows from (5.41) and (2.38), noting that for matrices one has $a \leq a'$ iff
$a' - a$ has positive eigenvalues. A similar argument to the one proving (5.57) then
shows that any affine bijection $L$ of $C$ must map the base space $[0,2]$ to itself (as
an affine bijection), and hence either $x_0 \mapsto x_0$ or $x_0 \mapsto 2 - x_0$. The latter fails to
preserve order, so $L$ must fix $x_0$. Similarly, $L$ maps each three-ball $C_{x_0}$ to itself by
an affine bijection, which, by the same proof as for Kadison’s Theorem above, must
be induced by some element $R_{x_0}$ of $O(3)$. Finally, the order-preserving condition
$x_0' - x_0 \geq \|x' - x\| \Rightarrow x_0' - x_0 \geq \|R_{x_0'}x' - R_{x_0}x\|$ obtained from (5.67) and the property
$L(x_0) = x_0$ just found can only be met if $R_{x_0}$ is independent of $x_0$.
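
As a worked check of the first claim in this proof, and assuming that (5.41) is the parametrization $a = \tfrac{1}{2}\sum_{\mu=0}^{3} x_\mu\sigma_\mu$ with $\sigma_0 = 1_2$ (an assumption made here, since (5.41) is not restated in this section), the eigenvalues of $a$ can be read off directly:

```latex
\begin{align*}
a &= \tfrac{1}{2}\bigl(x_0 1_2 + x\cdot\sigma\bigr), \qquad (x\cdot\sigma)^2 = \|x\|^2\, 1_2,\\
\operatorname{spec}(a) &= \bigl\{\tfrac{1}{2}(x_0 - \|x\|),\ \tfrac{1}{2}(x_0 + \|x\|)\bigr\},\\
0 \leq a \leq 1_2 &\iff 0 \leq x_0 - \|x\| \ \text{ and } \ x_0 + \|x\| \leq 2
\iff \|x\| \leq \min(x_0,\, 2 - x_0).
\end{align*}
```

Under that assumption the last condition describes fibers that are balls of radius $x_0$ for $0 \leq x_0 \leq 1$ and of radius $2 - x_0$ for $1 \leq x_0 \leq 2$, i.e. the double cone.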


Part 3 of Theorem 5.4 does not carry an official name; it may be attributed to Kadi- 
son, too, but the hard part of the proof was given earlier by Jacobson and Rickart. 
Rather than a contrived (though historically justified) name like “Jacobson-Rickart— 
Kadison Theorem”, we will simply speak of Jordan’s Theorem (for $H = \mathbb{C}^2$):

Theorem 5.13. Each linear bijection $J: M_2(\mathbb{C})_{\mathrm{sa}} \to M_2(\mathbb{C})_{\mathrm{sa}}$ that satisfies (5.13)
and hence (5.12) takes the form $J(a) = uau^*$, where $u$ is either unitary or anti-unitary, and is uniquely determined by $J$ up to a phase.
Proof. First, any Jordan map (and hence a fortiori any Jordan automorphism)
trivially maps projections into projections, as it preserves the defining conditions
$e^2 = e^* = e$. Second, any Jordan automorphism $J$ maps one-dimensional projections
into one-dimensional projections: if $e \in \mathcal{P}_1(H)$, then $J(e) \neq 0$ and $J(e) \neq 1_2$, both
because $J$ is injective in combination with $J(0) = 0$ and $J(1_2) = 1_2$, respectively.
Hence $J(e) \in \mathcal{P}_1(H)$, since this is the only remaining possibility (a more sophisticated argument shows that this is even true for any Hilbert space $H$). From (5.41)
and subsequent text, as in (5.44), by linearity of $J$ we therefore have

$$J\Big(\sum_{j=1}^{3} x_j\sigma_j\Big) = \sum_{j=1}^{3} (Rx)_j\,\sigma_j, \tag{5.68}$$

for some map $R: S^2 \to S^2$, which is bijective because $J$ is. Linearity of $J$ then
allows us to extend $R$ to a linear map $\mathbb{R}^3 \to \mathbb{R}^3$, with matrix

$$R_{jk} = \tfrac{1}{2}\,\mathrm{Tr}(\sigma_j J(\sigma_k)), \tag{5.69}$$

cf. (5.45). By (5.69), this linear map restricts to the given bijection $R: S^2 \to S^2$,
which also shows that it is isometric. Thus we have a linear isometry on $\mathbb{R}^3$, which
therefore lies in $O(3)$. The proof may then be completed as in Theorem 5.7.
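
The matrix in (5.69) is easy to compute in examples. The sketch below assumes the formula $R_{jk} = \tfrac{1}{2}\mathrm{Tr}(\sigma_j J(\sigma_k))$ and two sample maps chosen here for illustration: conjugation by a real rotation matrix `u`, and transposition (which on self-adjoint matrices agrees with conjugation by the anti-unitary operator of complex conjugation). All function names are hypothetical helpers.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def matrix_of(J):
    """R_jk = (1/2) Tr(sigma_j J(sigma_k)); real for maps preserving self-adjointness."""
    return [[complex(0.5 * trace(matmul(SIGMA[j], J(SIGMA[k])))).real
             for k in range(3)] for j in range(3)]

def det3(r):
    return (r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
            - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
            + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0]))

def is_orthogonal(r, tol=1e-12):
    return all(abs(sum(r[k][i] * r[k][j] for k in range(3)) - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

theta = 0.9
u = [[math.cos(theta / 2), -math.sin(theta / 2)],
     [math.sin(theta / 2), math.cos(theta / 2)]]

R_unitary = matrix_of(lambda a: matmul(matmul(u, a), dagger(u)))
R_transpose = matrix_of(transpose)
```

In this sketch both matrices come out orthogonal; the unitary example has determinant $+1$ and the transposition example has determinant $-1$, which is the familiar split of $O(3)$ into rotations and reflections.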


The case $H = \mathbb{C}^2$ was already exceptional in the context of Gleason’s Theorem, and
it remains so as far as weak Jordan symmetries and Bohr symmetries are concerned. 


Proposition 5.14. The poset $\mathcal{C}(M_2(\mathbb{C}))$ is isomorphic to $\{\bot\} \cup \mathbb{RP}^2$, where the real
projective plane $\mathbb{RP}^2$ is the quotient $S^2/\!\sim$ under the equivalence relation $x \sim -x$,
and the only nontrivial ordering is $\bot \leq p$ for any $p \in \mathbb{RP}^2$.

Proof. It is elementary that $M_2(\mathbb{C})$ has a single one-dimensional unital *-subalgebra,
namely $\mathbb{C}\cdot 1_2$, the multiples of the unit; this gives the singleton $\bot$ in $\mathcal{C}(M_2(\mathbb{C}))$.
Furthermore, any two-dimensional unital *-subalgebra $C$ of $M_2(\mathbb{C})$ is generated
by a one-dimensional projection $e$, in that $C$ is the linear span of $e$ and $1_2$. Hence $C$
is also the linear span of (the projection) $1_2 - e$ and $1_2$. In our parametrization of all
one-dimensional projections $e$ on $\mathbb{C}^2$ by $S^2$ (cf. Proposition 2.9), if $e$ corresponds to
$x$, then $1_2 - e$ corresponds to $-x$. This yields the remainder $\mathbb{RP}^2$ of $\mathcal{C}(M_2(\mathbb{C}))$.
Finally, commutative unital *-subalgebras $D$ of $M_2(\mathbb{C})$ of dimension $> 2$ do not
exist. For any such algebra $D$ would contain some two-dimensional $C$ just defined,
but a simple computation (for example, in a basis where $C$ consists of all diagonal
matrices) shows that the only matrices that commute with all elements of $C$ already
lie in $C$ (i.e., are diagonal). Hence no commutative extension of $C$ exists.


Bohr symmetries $B$ for $\mathbb{C}^2$ therefore correspond to bijections of $\mathbb{RP}^2$. Similarly,
weak Jordan symmetries $J$ for $\mathbb{C}^2$ correspond to bijections of $S^2$ (the difference
with Bohr symmetries lies in the fact that $J$ may also map $C = \mathrm{span}(e, 1_2)$ to itself
nontrivially, i.e., by sending $e$ to $1_2 - e$, which for $B$ would yield the identity map).
In both cases, few of these bijections are (anti-)unitarily implemented.
5.3 Equivalence between the six symmetry theorems


If dim(H) > 1, the first three claims of Theorem 5.4 are equivalent; if dim(H) > 2, 
all claims are. We will show this in some detail, if only because the proofs of the 
various equivalences relate the six symmetry concepts stated in Definition 5.1 in 
an instructive way. We will do this in the sequence Wigner ↔ Kadison ↔ Jordan,
and subsequently Jordan ↔ Ludwig, Jordan ↔ von Neumann, and Jordan ↔ Bohr.
Consequently, in principle only one part of Theorem 5.4 requires a proof. Although 
redundant, we will, in fact, prove both Wigner’s Theorem and Jordan’s (indeed, no 
independent proof of the other parts of Theorem 5.4 seems to be known!). The most 
transparent way to state the various equivalences is to note that in each case the set 
of symmetries of some given kind (i.e., Wigner, ...) forms a group. In all cases, the 
nontrivial part of the proof is the establishment of a “natural” bijection, from which 
the group homomorphism property is trivial (and hence will not be proved). 


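Proposition 5.15. There is an isomorphism of groups between:

• The group of affine bijections $K: \mathcal{D}(H) \to \mathcal{D}(H)$;
• The group of bijections $W: \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ that satisfy (5.8), viz.

$$W = K|_{\mathcal{P}_1(H)}; \tag{5.70}$$
$$K\Big(\sum_i \lambda_i e_{\upsilon_i}\Big) = \sum_i \lambda_i W(e_{\upsilon_i}), \tag{5.71}$$

where $\rho = \sum_i \lambda_i e_{\upsilon_i}$ is some (not necessarily unique) expansion of $\rho \in \mathcal{D}(H)$ in terms
of a basis of eigenvectors $\upsilon_i$ with eigenvalues $\lambda_i$, where $\lambda_i \geq 0$ and $\sum_i \lambda_i = 1$. In
particular, (5.70) and (5.71) are well defined.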


Proof. It is conceptually important to distinguish between $B(H)_{\mathrm{sa}}$ as a Banach space
in the usual operator norm $\|\cdot\|$, and $B_1(H)_{\mathrm{sa}}$, the Banach space of trace-class operators in its intrinsic norm $\|\cdot\|_1$. Of course, if $\dim(H) < \infty$, then $B(H)_{\mathrm{sa}} = B_1(H)_{\mathrm{sa}}$
as vector spaces, but even in that case the two norms do not coincide (although
they are equivalent). The proof below has the additional advantage of immediately
generalizing to the infinite-dimensional case. We start with (5.70).

1. Since $\mathcal{P}_1(H) = \partial_e\mathcal{D}(H)$, by the same argument as in the proof of Lemma 5.11,
any affine bijection of the convex set $\mathcal{D}(H)$ must preserve its boundary, so that
$K$ maps $\mathcal{P}_1(H)$ into itself, necessarily bijectively. The goal of the next two steps
is to prove that (5.70) satisfies (5.8), i.e., preserves transition probabilities.

2. An affine bijection $K: \mathcal{D}(H) \to \mathcal{D}(H)$ extends to an isometric isomorphism
$K_1: B_1(H)_{\mathrm{sa}} \to B_1(H)_{\mathrm{sa}}$ with respect to the trace-norm $\|\cdot\|_1$, as follows:

a. Put $K_1(0) = 0$ and for $b \geq 0$, $b \in B_1(H)$, i.e. $b \in B_1(H)_+$, and $b \neq 0$, define

$$K_1(b) = \|b\|_1\, K(b/\|b\|_1). \tag{5.72}$$
By construction, $K_1$ is isometric and preserves positivity. For $b \in B_1(H)_+$ we
have $\mathrm{Tr}(b) = \|b\|_1$, hence $b/\|b\|_1 \in \mathcal{D}(H)$, on which $K$ is defined.
Linearity of $K_1$ with positive coefficients (as a consequence of the affine property of $K$) is verified as in the proof of Lemma 5.11; this time, use

$$a + b = (\|a\|_1 + \|b\|_1)\cdot\Big(t\,\frac{a}{\|a\|_1} + (1-t)\,\frac{b}{\|b\|_1}\Big), \tag{5.73}$$

with $t = \|a\|_1/(\|a\|_1 + \|b\|_1)$. Note that if $a,b \in B_1(H)_+$, then $a+b \in B_1(H)_+$.

b. For $b \in B_1(H)_{\mathrm{sa}}$, decompose $b = b_+ - b_-$, where $b_\pm \geq 0$; see Proposition
A.24 (this remains valid in general Hilbert spaces). We then define

$$K_1(b) = K_1(b_+) - K_1(b_-). \tag{5.74}$$

To show that this makes $K_1$ linear on all of $B_1(H)_{\mathrm{sa}}$, suppose $b = b'_+ - b'_-$
with $b'_\pm \geq 0$. Then $b'_+ + b_- = b_+ + b'_-$, and since each term is positive,

$$K_1(b'_+ + b_-) = K_1(b'_+) + K_1(b_-) = K_1(b_+ + b'_-) = K_1(b_+) + K_1(b'_-),$$

by the previous step. Hence $K_1(b'_+) - K_1(b'_-) = K_1(b_+) - K_1(b_-)$, so that
(5.74) is actually independent of the choice of the decomposition of $b$ as long
as the operators are positive. Hence for $a,b \in B_1(H)_{\mathrm{sa}}$ we may compute

$$K_1(a+b) = K_1(a_+ + b_+ - (a_- + b_-)) = K_1(a_+ + b_+) - K_1(a_- + b_-)$$
$$= K_1(a_+) + K_1(b_+) - K_1(a_-) - K_1(b_-) = K_1(a) + K_1(b),$$

since $a_+ + b_+$ and $a_- + b_-$ are both positive.

The key point in verifying isometry of $K_1$ is the property $|b| = b_+ + b_-$, which
follows from (A.76) or Theorem B.94. Using this property, we have

$$\|K_1(b)\|_1 = \mathrm{Tr}(|K_1 b|) = \mathrm{Tr}(|K_1(b_+) - K_1(b_-)|) = \mathrm{Tr}(K_1(b_+) + K_1(b_-))$$
$$= \mathrm{Tr}(b_+ + b_-) = \mathrm{Tr}(|b_+ - b_-|) = \mathrm{Tr}(|b|) = \|b\|_1.$$


3. For any two unit vectors $\psi,\varphi$ in $H$ we have the formula

$$\|e_\psi - e_\varphi\|_1 = 2\sqrt{1 - \mathrm{Tr}(e_\psi e_\varphi)}, \tag{5.75}$$

which can easily be proved by a calculation with $2 \times 2$ matrices (since everything
takes place in the two-dimensional subspace spanned by $\psi$ and $\varphi$, except when
$\varphi = z\psi$, $z \in \mathbb{T}$, in which case (5.75) reads $0 = 0$ and hence is true also). Since $K_1$
is linear as well as isometric with respect to the trace-norm, we have

$$\|K_1(e_\psi) - K_1(e_\varphi)\|_1 = \|K_1(e_\psi - e_\varphi)\|_1 = \|e_\psi - e_\varphi\|_1,$$

and hence, by (5.75), $\mathrm{Tr}(K_1(e_\psi)K_1(e_\varphi)) = \mathrm{Tr}(e_\psi e_\varphi)$. Eq. (5.70) then gives (5.8).
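
Formula (5.75) can be spot-checked numerically in $\mathbb{C}^2$. The sketch below is only an illustration: it assumes the closed-form eigenvalues of a Hermitian $2 \times 2$ matrix to compute the trace norm, and the helper names and sample vectors are hypothetical choices made here.

```python
import math

def normalize(v):
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / n for c in v]

def projection(psi):
    """Rank-one projection e_psi = |psi><psi| for a unit vector psi in C^2."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def trace_norm_hermitian(m):
    """Sum of absolute eigenvalues of a Hermitian 2x2 matrix (closed form)."""
    tr = complex(m[0][0] + m[1][1]).real
    det = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return abs((tr + disc) / 2.0) + abs((tr - disc) / 2.0)

def transition_probability(psi, phi):
    """Tr(e_psi e_phi) = |<psi, phi>|^2."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi))) ** 2

psi = normalize([1 + 2j, 0.5 - 1j])
phi = normalize([0.3j, 1.0 + 0j])

e_psi, e_phi = projection(psi), projection(phi)
difference = [[e_psi[i][j] - e_phi[i][j] for j in range(2)] for i in range(2)]

lhs = trace_norm_hermitian(difference)
rhs = 2.0 * math.sqrt(1.0 - transition_probability(psi, phi))
```

Here `lhs` and `rhs` are the two sides of (5.75) for the sample vectors; they should agree up to rounding.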
We move on to (5.71). The main concern is that this expression be well defined,
since in case some eigenvalue $\lambda > 0$ of $\rho$ is degenerate (necessarily with finite multiplicity, even in infinite dimension, since $\rho$ is compact), the basis of the eigenspace
$H_\lambda$ that takes part in the sum $\sum_i \lambda_i e_{\upsilon_i}$ is far from unique. This is settled as follows:

Lemma 5.16. Let $W: \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ be a bijection that satisfies (5.8), let $L \subset H$
be a (finite-dimensional) subspace, and let $(\upsilon_i)$ and $(\upsilon'_i)$ be bases of $L$. Then

$$\sum_i W(e_{\upsilon_i}) = \sum_i W(e_{\upsilon'_i}). \tag{5.76}$$

Proof. As usual, for projections $e$ and $f$ on $H$ we write $e \leq f$ iff $eH \subseteq fH$. From
(B.212) and (B.214) we have $\sum_i |\langle \upsilon_i, \psi\rangle|^2 \leq 1$ for any unit vector $\psi \in H$, with
equality iff $\psi \in L$. In other words, $e_\psi \leq e_L$ iff $\sum_i \mathrm{Tr}(e_{\upsilon_i} e_\psi) = 1$. Furthermore, by
(5.8) the images $W(e_{\upsilon_i})$ remain orthogonal; hence $\sum_j W(e_{\upsilon_j})$ is a projection, and
$e \leq \sum_j W(e_{\upsilon_j})$ iff $\sum_j \mathrm{Tr}(W(e_{\upsilon_j})e) = 1$. By (5.8), this condition is satisfied for
$e = W(e_{\upsilon'_i})$, so that $W(e_{\upsilon'_i}) \leq \sum_j W(e_{\upsilon_j})$ for each $i$. Since also the projections $W(e_{\upsilon'_i})$
are orthogonal, this gives $\sum_i W(e_{\upsilon'_i}) \leq \sum_j W(e_{\upsilon_j})$. Interchanging the roles of the two
bases gives the converse, yielding (5.76).


Finally, to prove bijectivity of the correspondence $K \leftrightarrow W$, we need the property

$$K\Big(\sum_i \lambda_i e_{\upsilon_i}\Big) = \sum_i \lambda_i K(e_{\upsilon_i}), \tag{5.77}$$

since this implies that $K$ is determined by its action on $\mathcal{P}_1(H) \subset \mathcal{D}(H)$. In finite
dimension this follows from convexity of $K$, and we are done. In infinite dimension,
we in addition need continuity of $K$, as well as convergence of the sum $\sum_i \lambda_i e_{\upsilon_i}$
not only in the operator norm (as follows from the spectral theorem for self-adjoint
compact operators), but also in the trace norm: for finite $n,m$,

$$\Big\|\sum_{i=n}^{m} \lambda_i e_{\upsilon_i}\Big\|_1 \leq \sum_{i=n}^{m} |\lambda_i|\,\|e_{\upsilon_i}\|_1 = \sum_{i=n}^{m} \lambda_i,$$

since $\|e_{\upsilon_i}\|_1 = 1$. Because $\sum_i \lambda_i = 1$, the above expression vanishes as $n,m \to \infty$,
whence $\rho_n = \sum_{i=1}^{n} \lambda_i e_{\upsilon_i}$ is a Cauchy sequence in $B_1(H)$, which by completeness of
the latter converges (to an element of $\mathcal{D}(H)$, as one easily verifies).

The proof of continuity is completed by noting that $K$ is continuous with respect
to the trace norm, for it is isometric and hence bounded (see step 2 above).


It is enlightening to give a rather more conceptual proof that $K|_{\mathcal{P}_1(H)}$ satisfies (5.8),
which is based on a result to be used more often in the future. In what follows, for
any convex set $K$, the notation $A_b(K)$ stands for the real vector space of bounded
affine functions $f: K \to \mathbb{R}$, that is, bounded functions satisfying

$$f(tx + (1-t)y) = tf(x) + (1-t)f(y), \qquad x,y \in K,\ t \in (0,1). \tag{5.78}$$

It is easily checked that $A_b(K)$ with the supremum-norm is a real Banach space.
Proposition 5.17. For any Hilbert space $H$ we have an isometric isomorphism

$$A_b(\mathcal{D}(H)) \cong B(H)_{\mathrm{sa}}, \tag{5.79}$$
$$f \leftrightarrow a; \tag{5.80}$$
$$f(\rho) = \mathrm{Tr}(\rho a), \tag{5.81}$$

which preserves the unit (i.e., $1_{\mathcal{D}(H)} \leftrightarrow 1_H$) as well as the order (i.e., $f \geq 0$ iff $a \geq 0$).

Note that under the identification $\mathcal{D}(H) \cong S_n(B(H))$ (where in finite dimension the
normal state space $S_n(B(H))$ simply coincides with the state space $S(B(H))$), where
$\rho \leftrightarrow \omega$ as in (2.33), i.e., $\omega(a) = \mathrm{Tr}(\rho a)$, the above isomorphism simply reads

$$A_b(S_n(B(H))) \cong B(H)_{\mathrm{sa}}, \tag{5.82}$$
$$\hat{a} \leftrightarrow a; \tag{5.83}$$
$$\hat{a}(\omega) = \omega(a). \tag{5.84}$$

Proof. It is clear that for each $a \in B(H)_{\mathrm{sa}}$ the function $f: \rho \mapsto \mathrm{Tr}(\rho a)$ (or, equivalently, $\hat{a}: \omega \mapsto \omega(a)$) is affine as well as real-valued, and is bounded by (A.100)
(supplemented, if $\dim(H) = \infty$, by Lemma B.142), noting that $\|\rho\|_1 = 1$ for $\rho \in \mathcal{D}(H)$, and in fact (B.483) yields the equality $\|f\|_\infty = \|a\|$ (or $\|\hat{a}\|_\infty = \|a\|$).
Conversely, $f \in A_b(\mathcal{D}(H))$ defines a function $Q: H \to \mathbb{R}$ by

$$Q(0) = 0; \tag{5.85}$$
$$Q(\psi) = \|\psi\|^2 f(e_{\psi/\|\psi\|}) \quad (\psi \neq 0). \tag{5.86}$$

This function is clearly bounded on the unit ball of $H$, as in

$$|Q(\psi)| \leq \|f\|_\infty \|\psi\|^2. \tag{5.87}$$

To check that $Q$ in fact defines a quadratic form on $H$, we verify the properties (A.8)
- (A.9). The first is trivial. The second follows from the easily verified identity

$$t\, e_{\frac{v+w}{\|v+w\|}} + (1-t)\, e_{\frac{v-w}{\|v-w\|}} = s\, e_{\frac{v}{\|v\|}} + (1-s)\, e_{\frac{w}{\|w\|}}, \tag{5.88}$$

where $v,w \neq 0$, $v \neq \pm w$, and the coefficients $s,t$ are given by

$$t = \frac{\|v+w\|^2}{2(\|v\|^2 + \|w\|^2)}; \tag{5.89}$$
$$s = \frac{\|v\|^2}{\|v\|^2 + \|w\|^2}. \tag{5.90}$$

The affine property (5.78) then immediately yields (A.9). According to Proposition
B.79, we obtain a unique operator $a \in B(H)_{\mathrm{sa}}$ such that $Q(\psi) = \langle \psi, a\psi\rangle$, i.e.,

$$\langle \psi, a\psi\rangle = f(e_\psi), \qquad \psi \in H,\ \|\psi\| = 1. \tag{5.91}$$

Since also $\langle \psi, a\psi\rangle = \mathrm{Tr}(e_\psi a)$, we have established (5.81) for each $\rho = e_\psi$, where
$\psi \in H$, $\|\psi\| = 1$. To extend this result to general density operators $\rho = \sum_i \lambda_i e_{\upsilon_i}$, we
use (A.100) as well as convergence of the above sum in the trace norm $\|\cdot\|_1$, cf. the
proof of Lemma 5.16; the details are analogous to the proof of Theorem B.146.
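
The "easily verified identity" (5.88) with coefficients (5.89)-(5.90) can be checked numerically. The sketch below uses real vectors in $\mathbb{R}^3$ for simplicity (an assumption of this illustration; the sample vectors and helper names are hypothetical), representing each projection $e_{v/\|v\|}$ by the matrix $vv^{T}/\|v\|^2$.

```python
def norm_sq(v):
    return sum(c * c for c in v)

def projector(v):
    """Matrix of the projection onto the line spanned by a nonzero real vector v."""
    n2 = norm_sq(v)
    return [[v[i] * v[j] / n2 for j in range(len(v))] for i in range(len(v))]

def combine(t, p, q):
    """Entrywise convex combination t*p + (1-t)*q of two matrices."""
    return [[t * a + (1 - t) * b for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]

v = [1.0, 2.0, -0.5]
w = [0.3, -1.0, 2.0]
plus = [a + b for a, b in zip(v, w)]
minus = [a - b for a, b in zip(v, w)]

t = norm_sq(plus) / (2 * (norm_sq(v) + norm_sq(w)))   # coefficient (5.89)
s = norm_sq(v) / (norm_sq(v) + norm_sq(w))            # coefficient (5.90)

lhs = combine(t, projector(plus), projector(minus))   # left-hand side of (5.88)
rhs = combine(s, projector(v), projector(w))          # right-hand side of (5.88)
max_gap = max(abs(a - b) for rl, rr in zip(lhs, rhs) for a, b in zip(rl, rr))
```

The identity is just the parallelogram law $|v+w\rangle\langle v+w| + |v-w\rangle\langle v-w| = 2|v\rangle\langle v| + 2|w\rangle\langle w|$ divided by $2(\|v\|^2 + \|w\|^2)$, which is why `max_gap` should vanish up to rounding.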


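Proposition 5.18. For any unit vectors $\psi,\varphi \in H$ we have

$$\mathrm{Tr}(e_\psi e_\varphi) = \inf\{f(e_\psi) \mid f \in A_b(\mathcal{D}(H)),\ 0 \leq f \leq 1,\ f(e_\varphi) = 1\}. \tag{5.92}$$

The virtue of this formula is that the expression on the left-hand side, which defines
the transition probabilities on $\partial_e\mathcal{D}(H) = \mathcal{P}_1(H)$, is intrinsically given by the convex structure of $\mathcal{D}(H)$. Consequently, any affine bijection of this convex set (which
already preserves the boundary) must preserve these probabilities.

Proof. By the previous proposition, eq. (5.92) is equivalent to

$$\mathrm{Tr}(e_\psi e_\varphi) = \inf\{\langle \psi, a\psi\rangle \mid a \in B(H)_{\mathrm{sa}},\ 0 \leq a \leq 1,\ \langle \varphi, a\varphi\rangle = 1\}. \tag{5.93}$$

Since $\mathrm{Tr}(e_\psi e_\varphi) = \langle \psi, e_\varphi\psi\rangle$, we are ready if we can show that the infimum is
reached at $a = e_\varphi$. Therefore, we prove that for any $a$ as specified we must have
$\langle \psi, a\psi\rangle \geq \mathrm{Tr}(e_\psi e_\varphi) = |\langle \varphi, \psi\rangle|^2$. To do so, we are going to find a contradiction if

$$\langle \psi, a\psi\rangle < \mathrm{Tr}(e_\psi e_\varphi), \tag{5.94}$$

for some such $a$. Indeed, $\langle \varphi, a\varphi\rangle = 1$ with $\|a\| \leq 1$ (which follows from $0 \leq a \leq 1$)
and $\|\varphi\| = 1$ imply, by Cauchy–Schwarz, that $a\varphi = \varphi$. Since $a^* = a$ (by positivity of
$a$), we also have $a: (\mathbb{C}\cdot\varphi)^\perp \to (\mathbb{C}\cdot\varphi)^\perp$, so we may write $a = e_\varphi + a'$, with $a'\varphi = 0$
and $a'$ mapping $(\mathbb{C}\cdot\varphi)^\perp$ to itself. Then $a \geq 0$ implies $a' \geq 0$. If (5.94) holds, then
$\langle \psi, a'\psi\rangle < 0$, which contradicts positivity of $a'$ (and hence of $a$).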
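
For $H = \mathbb{C}^2$ the infimum in (5.93) can be made fully explicit. The following worked computation is only an illustration of the proof above; the unit vector $\varphi^\perp$ orthogonal to $\varphi$, the number $c$, and the abbreviation $p$ are symbols introduced here for the purpose.

```latex
\begin{align*}
a &= e_\varphi + c\, e_{\varphi^\perp}, \qquad 0 \leq c \leq 1,
   \qquad p := \mathrm{Tr}(e_\psi e_\varphi) = |\langle \varphi, \psi\rangle|^2,\\
\langle \psi, a\psi\rangle
  &= |\langle \varphi, \psi\rangle|^2 + c\,|\langle \varphi^\perp, \psi\rangle|^2
   = p + c\,(1 - p) \;\geq\; p,
\end{align*}
```

with equality at $c = 0$, i.e. $a = e_\varphi$. In two dimensions every admissible $a$ is of this form, because $a'$ then acts on the one-dimensional space $(\mathbb{C}\cdot\varphi)^\perp$ as multiplication by some $c \in [0,1]$.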


We now turn to the equivalence between Jordan’s Theorem and Kadison’s Theorem. 


Proposition 5.19. There is an isomorphism of groups between:

• The group of affine bijections $K: \mathcal{D}(H) \to \mathcal{D}(H)$;
• The group of Jordan automorphisms $J: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$,

such that for any $a \in B(H)_{\mathrm{sa}}$ one has

$$\mathrm{Tr}(K(\rho)a) = \mathrm{Tr}(\rho J(a)) \qquad (\rho \in \mathcal{D}(H)). \tag{5.95}$$

This immediately follows from the following lemma (of independent interest):

Lemma 5.20. 1. There is a bijective correspondence between:

• affine bijections $K: \mathcal{D}(H) \to \mathcal{D}(H)$;
• unital positive (i.e. order-preserving) linear bijections $\alpha: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$,
such that for any $a \in B(H)_{\mathrm{sa}}$ one has (5.95).

2. A map $\alpha: B(H) \to B(H)$ is a unital positive linear bijection iff it is a Jordan
automorphism.
Proof. 1. An affine bijection $K: \mathcal{D}(H) \to \mathcal{D}(H)$ induces an isomorphism

$$K^*: A_b(\mathcal{D}(H)) \to A_b(\mathcal{D}(H)); \tag{5.96}$$
$$f \mapsto f \circ K, \tag{5.97}$$

which is evidently unital, positive, and isometric. Consequently, by Proposition
5.17, $K^*$ corresponds to some isomorphism $\alpha: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$, which necessarily shares the properties of being unital, positive, and isometric; this follows
abstractly from the proposition, but may also be verified directly from (5.95).
Conversely, such a map $\alpha$ yields a map $K$ directly by (5.95); to see this, we
identify $\mathcal{D}(H)$ with the normal state space of $B(H)$ through $\rho \leftrightarrow \omega$, as usual, cf.
(2.33), and note that $K\omega$ is the state defined by $(K\omega)(a) = \omega(\alpha(a))$, or briefly
$K\omega = \omega \circ \alpha$. This is often written as $K = \alpha^*$, and for future reference we write

$$\alpha^*\omega(a) = \omega(\alpha(a)). \tag{5.98}$$

2. The nontrivial direction of the proof (i.e. positive etc. $\Rightarrow$ Jordan) is based on a
number of facts from operator theory:

a. Unital positive linear maps on $B(H)_{\mathrm{sa}}$ preserve $\mathcal{P}(H)$, cf. (2.164).

b. Any two projections $e$ and $f$ are orthogonal ($ef = 0$) iff $e + f \leq 1_H$ (easy).

c. Any $a \in B(H)_{\mathrm{sa}}$ is a norm-limit of finite sums of the kind $\sum_i \lambda_i e_i$, where $\lambda_i \in \mathbb{R}$
and the $e_i$ are mutually orthogonal projections (this follows from the spectral
theorem for bounded self-adjoint operators in the form of Theorem B.104).

d. Any unital positive linear map $\alpha: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$ is continuous. Since

$$-\|a\|\cdot 1_H \leq a \leq \|a\|\cdot 1_H \qquad (a \in B(H)_{\mathrm{sa}}), \tag{5.99}$$

by (C.83), applying the positive map $\alpha$ and using $\alpha(1_H) = 1_H$ yields

$$-\|a\|\cdot 1_H \leq \alpha(a) \leq \|a\|\cdot 1_H.$$

This is possible only if $\|\alpha(a)\| \leq \|a\|$, and hence $\alpha$ is continuous with norm
bounded by $\|\alpha\| \leq 1$. In fact, since $\alpha$ is unital we have $\|\alpha\| = 1$.


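Therefore, any unital positive linear map $\alpha$ preserves orthogonality of projections, so if $a = \sum_i \lambda_i e_i$ (finite sum), then

$$\alpha(a^2) = \alpha\Big(\sum_i \lambda_i^2 e_i\Big) = \sum_i \lambda_i^2\,\alpha(e_i) = \sum_{i,j} \lambda_i\lambda_j\,\alpha(e_i)\alpha(e_j) = \alpha(a)^2, \tag{5.100}$$

since $e_i e_j = \delta_{ij} e_i$ and by the above comment also $\alpha(e_i)\alpha(e_j) = \delta_{ij}\,\alpha(e_i)$. By
continuity of $\alpha$, this property extends to arbitrary $a \in B(H)_{\mathrm{sa}}$. Finally, since

$$a \circ b = \tfrac{1}{2}\big((a+b)^2 - a^2 - b^2\big), \tag{5.101}$$

preserving squares as in (5.100) implies preserving the Jordan product $\circ$.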
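
For completeness, the last step can be written out, assuming the usual definition $a \circ b = \tfrac{1}{2}(ab + ba)$ of the Jordan product and using only linearity of $\alpha$ together with the preservation of squares:

```latex
\begin{align*}
(a+b)^2 - a^2 - b^2 &= a^2 + ab + ba + b^2 - a^2 - b^2 = ab + ba,\\
\tfrac{1}{2}\bigl((a+b)^2 - a^2 - b^2\bigr) &= \tfrac{1}{2}(ab + ba) = a \circ b,\\
\alpha(a \circ b) &= \tfrac{1}{2}\bigl(\alpha(a+b)^2 - \alpha(a)^2 - \alpha(b)^2\bigr)
                   = \alpha(a) \circ \alpha(b).
\end{align*}
```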
We now turn to the equivalence between Ludwig symmetries and Jordan ones.

Proposition 5.21. There is an isomorphism of groups between:

• The group of affine order isomorphisms $L: \mathcal{E}(H) \to \mathcal{E}(H)$;
• The group of Jordan automorphisms $J: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$.

Proof. Since $L$ is an order isomorphism, it satisfies $L(0) = 0$ (as well as $L(1_H) = 1_H$),
since 0 is the bottom element of $\mathcal{E}(H)$ as a poset (and $1_H$ is its top element).
As in the proof of Lemma 5.11, one shows that this property plus convexity implies
$L(ta) = tL(a)$ and $L(a+b) = L(a) + L(b)$ whenever defined. Defining $J$ by

$$J(0) = 0; \tag{5.102}$$
$$J(a) = s\cdot L(a/s) \quad (a > 0,\ s \geq \|a\|); \tag{5.103}$$
$$J(a) = -J(-a) \quad (a < 0), \tag{5.104}$$

where $a > 0$ means $a \geq 0$ and $a \neq 0$, and $a < 0$ means $-a \geq 0$ and $a \neq 0$, once
again the reasoning near the end of the proof of Lemma 5.11 shows that $J$ is linear;
it is a unital order-preserving bijection by construction. Hence $J$ is a Jordan automorphism by Lemma 5.20.2. Of course, instead of (5.104) one could equivalently
have defined $J$ on general $a \in B(H)_{\mathrm{sa}}$ by $J(a) = J(a_+) - J(a_-)$, using the (by now
hopefully familiar) decomposition $a = a_+ - a_-$ with $a_\pm \geq 0$ and $a_+ a_- = 0$.
Conversely, once again using Lemma 5.20.2, a Jordan automorphism (5.11) preserves order as well as the unit, so that the inequality $0 \leq a \leq 1_H$ characterizing
$a \in \mathcal{E}(H)$ is preserved, i.e., $0 \leq J(a) \leq 1_H$. Thus $J$ preserves $\mathcal{E}(H)$, where it preserves order. Convexity is obvious, since $L = J|_{\mathcal{E}(H)}$ comes from a linear map.


The equivalence between Jordan’s Theorem and von Neumann’s Theorem (provided
$\dim(H) \geq 3$) hinges on the following corollary of Gleason’s Theorem (cf. §D.1).

Corollary 5.22. Let $\dim(H) > 2$. Then an isomorphism $N$ of $\mathcal{P}(H)$ as an orthocomplemented lattice has a unique extension to a linear map $\alpha: B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$,
which is (automatically) invertible, unital, and positive.

Proof. According to Lemma D.2, $N$ preserves all suprema in $\mathcal{P}(H)$. Since we have
$\sum_i e_i = \bigvee_i e_i$ for any family of mutually orthogonal projections and since $N$ by definition preserves the orthocomplementation $e^\perp = 1_H - e$ and hence preserves orthogonality of projections, we may compute

$$N\Big(\sum_i e_i\Big) = N\Big(\bigvee_i e_i\Big) = \bigvee_i N(e_i) = \sum_i N(e_i). \tag{5.105}$$

Consequently, for any normal state $\omega$ on $B(H)$, the map $e \mapsto \omega \circ N(e)$ is a probability
measure on $\mathcal{P}(H)$, which by Gleason’s Theorem has a unique linear extension to
$B(H)$ and hence a fortiori to $B(H)_{\mathrm{sa}}$. We use this in order to define $\alpha$, as follows.
First, let $a \in B(H)_{\mathrm{sa}}$ and suppose $a = \sum_j \lambda_j f_j$ for some finite family $(f_j)$ of projections (not necessarily orthogonal), and some $\lambda_j \in \mathbb{R}$. Then $\sum_j \lambda_j N(f_j)$ is independent of the particular decomposition of $a$ that has been chosen, so we may put
$$\alpha(a) = \sum_j \lambda_j N(f_j). \tag{5.106}$$

To see this, put $a = \sum_i \lambda'_i f'_i$, and hence $\alpha'(a) = \sum_i \lambda'_i N(f'_i)$, and suppose $\alpha'(a) \neq \alpha(a)$.
By (B.477) there exists a normal state $\omega$ such that $\omega(\alpha'(a)) \neq \omega(\alpha(a))$;
indeed, each element of $B_1(H)$ is a linear combination of at most four density operators, so that each normal linear functional on $B(H)$ is a linear combination of at
most four normal states. But since $\omega \circ N$ is linear, this implies $\omega \circ N(a) \neq \omega \circ N(a)$,
which is a contradiction. Hence $\alpha'(a) = \alpha(a)$ and accordingly, (5.106) is well defined. Because it is independent of the decomposition of $a$ into projections, $\alpha$ is
linear: if $a = \sum_j \lambda_j f_j$ and $a' = \sum_i \lambda'_i f'_i$, then $a + a' = \sum_j \lambda_j f_j + \sum_i \lambda'_i f'_i$, so that

$$N(a + a') = N\Big(\sum_j \lambda_j f_j + \sum_i \lambda'_i f'_i\Big) = \sum_j \lambda_j N(f_j) + \sum_i \lambda'_i N(f'_i) = N(a) + N(a').$$

Similarly, for any $t \in \mathbb{R}$ we have

$$N(ta) = N\Big(\sum_j t\lambda_j f_j\Big) = \sum_j t\lambda_j N(f_j) = t\sum_j \lambda_j N(f_j) = tN(a).$$

We may now extend $\alpha$ to all of $B(H)_{\mathrm{sa}}$ by continuity. Indeed, according to the
spectral theorem in the form (B.326), the set of all operators of the form $a = \sum_j \lambda_j f_j$
with all $f_j$ mutually orthogonal (so that $a$ is given by its spectral resolution) is norm-dense in $B(H)_{\mathrm{sa}}$. Applying (5.106), and noting that $\|a\| = \sup_j |\lambda_j|$, we may estimate

$$\|\alpha(a)\| = \Big\|\sum_j \lambda_j N(f_j)\Big\| \leq \sup_j\{|\lambda_j|\}\,\Big\|\sum_j N(f_j)\Big\| \leq \|a\|,$$

since the $N(f_j)$ are mutually orthogonal and hence sum to some projection, which
has norm 1 (unless $a = 0$). For general $a \in B(H)_{\mathrm{sa}}$, we may therefore define $N$ by
$N(a) = \lim_n N(a_n)$, where each $a_n$ is of the above (spectral) form and $\|a_n - a\| \to 0$.

To prove that $\alpha$ is positive, we show that $\alpha(a) \geq 0$ whenever $a \geq 0$. As in the preceding step, initially suppose that $a = \sum_j \lambda_j f_j$ has a finite spectral resolution. Then
$a \geq 0$ iff $\lambda_j \geq 0$ for each $j$, and hence $\alpha(a) \geq 0$ by (5.106), since by orthogonality
of the $N(f_j)$ this equation states the spectral resolution of $\alpha(a)$. Now if $a_n \geq 0$ and
$a_n \to a$ (in norm), then $\langle \psi, a_n\psi\rangle \to \langle \psi, a\psi\rangle$, which must remain positive, so that
$a \geq 0$. Hence positivity of $\alpha$ on all of $B(H)_{\mathrm{sa}}$ follows by continuity.

Finally, $\alpha$ inherits invertibility from $N$, and it is unital by (5.105), taking $e_i = |\upsilon_i\rangle\langle\upsilon_i|$
for some basis $(\upsilon_i)$ of $H$ (or using the fact that it preserves $\top = 1_H$).


Subsequently, we use Lemma 5.20 to further extend $\alpha$ by complex linearity to a
Jordan isomorphism of $B(H)$; see Definition 5.1.

Finally, the equivalence between weak Jordan symmetries and Bohr symmetries 
follows from Hamhalter’s Theorem 9.4, whereas Theorem 9.7 strengthens this to an 
equivalence between Jordan symmetries and Bohr symmetries. The proof of these 
theorems does not seem to simplify in the special case at hand, i.e. A = B(H). 
5.4 Proof of Jordan’s Theorem


In view of the equivalence between the six parts of Theorem 5.4, we only need to 
prove one of them. In the literature, one only finds proofs of Jordan’s Theorem and 
of Wigner’s Theorem, and we present each of these (surprisingly but instructively, 
these proofs look completely different). We start with Jordan’s Theorem: 


Theorem 5.23. Any Jordan automorphism $J_{\mathbb{C}}$ of $B(H)$ is given by either

$$J_{\mathbb{C}}(a) = \alpha_u(a) = uau^*, \tag{5.107}$$

where $u$ is unitary (and is determined by $J_{\mathbb{C}}$ up to a phase), or by

$$J_{\mathbb{C}}(a) = \alpha'_u(a) = ua^*u^*, \tag{5.108}$$

where $u$ is anti-unitary (and is determined by $J_{\mathbb{C}}$ up to a phase, too).

The difficult part of the proof is Theorem C.175, which implies:

Proposition 5.24. A Jordan automorphism $\alpha$ of $B(H)$ is either an automorphism or
an anti-automorphism.

Recall that an automorphism of $B(H)$ is a linear bijection $\alpha: B(H) \to B(H)$ that satisfies $\alpha(a^*) = \alpha(a)^*$ and $\alpha(ab) = \alpha(a)\alpha(b)$; an anti-automorphism, on the other
hand, satisfies the first property whilst the latter is replaced by $\alpha(ab) = \alpha(b)\alpha(a)$.
Clearly, both automorphisms and anti-automorphisms are Jordan automorphisms.
Granting this result, we may deal with the two cases separately.
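
A concrete anti-automorphism of $M_2(\mathbb{C})$ is transposition, and the claims in the preceding paragraph can be checked on examples. The sketch below is only an illustration; the sample matrices `a`, `b` and the helper names are arbitrary choices made here, and the Jordan product is taken to be $a \circ b = \tfrac{1}{2}(ab + ba)$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def jordan(a, b):
    """Jordan product (ab + ba)/2."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[0.5 * (x + y) for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def close(p, q, tol=1e-12):
    return all(abs(x - y) < tol for rp, rq in zip(p, q) for x, y in zip(rp, rq))

a = [[1, 2 - 1j], [0.5j, -3]]
b = [[0, 1j], [4, 1 + 1j]]

# Transposition reverses ordinary products ...
reverses_products = close(transpose(matmul(a, b)), matmul(transpose(b), transpose(a)))
# ... does not preserve them in the original order for this non-commuting pair ...
preserves_products = close(transpose(matmul(a, b)), matmul(transpose(a), transpose(b)))
# ... but does preserve the symmetric Jordan product.
preserves_jordan = close(transpose(jordan(a, b)), jordan(transpose(a), transpose(b)))
```

The Jordan product is symmetric in its two arguments, so reversing the order of multiplication leaves it unchanged; this is the reason an anti-automorphism is still a Jordan automorphism.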


Proposition 5.25. Any automorphism $\alpha: B(H) \to B(H)$ takes the form $\alpha = \alpha_u$, see
(5.107), where $u: H \to H$ is unitary, uniquely determined by $\alpha$ up to a phase.

The proof uses the following lemmas. The first follows from Theorem C.62.4.

Lemma 5.26. If $\alpha: B(H) \to B(H)$ is an automorphism and $a \in B(H)$, then

$$\|\alpha(a)\| = \|a\|. \tag{5.109}$$

Lemma 5.27. If $\alpha: B(H) \to B(H)$ is an automorphism and $e \in B(H)$ is a one-dimensional projection, then so is $\alpha(e)$.

Proof. It should be obvious that automorphisms $\alpha$ preserve projections $e$ (whose
defining properties are $e^2 = e^* = e$). Furthermore, $\alpha$ preserves order, i.e., if $a \geq 0$
(in that, as always, $\langle \psi, a\psi\rangle \geq 0$ for each $\psi \in H$, or, equivalently, $a = b^*b$), then
$\alpha(a) \geq 0$ (this is clear from the second way of expressing positivity). Consequently,
if $a \leq b$ (in that $b - a \geq 0$), then $\alpha(a) \leq \alpha(b)$. We notice that if we define $e \leq f$ iff
$eH \subseteq fH$, then $e \leq f$ iff $e \leq f$ as self-adjoint operators (in that $\langle \psi, e\psi\rangle \leq \langle \psi, f\psi\rangle$
for each $\psi \in H$); see Proposition C.170. With respect to the ordering $\leq$ the one-dimensional projections $e$ are atomic, in the sense that $0 \leq e$ (but $e \neq 0$) and if
$0 \leq f \leq e$, then either $f = 0$ or $f = e$. Now automorphisms of $B(H)$
restrict to isomorphisms of the projection lattice $\mathcal{P}(H)$, which preserve atoms (as these are intrinsically
defined by the partial order).
We are now ready for the (constructive!) proof of Proposition 5.25.

Proof. For some fixed unit vector $\varphi \in H$, take the corresponding one-dimensional
projection $e_\varphi$ and define a new unit vector $\chi$ (up to a phase) by

$$e_\chi := \alpha(e_\varphi). \tag{5.110}$$

Now any $\psi \in H$ may be written as $\psi = a\varphi$, for some $a \in B(H)$. Attempt to define
an operator $u$ by $u\psi = \alpha(a)\chi$, i.e.,

$$ua\varphi = \alpha(a)\chi. \tag{5.111}$$

This looks dangerously ill-defined, since many different operators $a$ may give rise
to the same $\psi$. Fortunately, we may compute

$$\|a\varphi\|_H = \|a e_\varphi \varphi\|_H = \|a e_\varphi\|_{B(H)} = \|\alpha(a e_\varphi)\|_{B(H)}$$
$$= \|\alpha(a)\alpha(e_\varphi)\|_{B(H)} = \|\alpha(a)e_\chi\|_{B(H)} = \|\alpha(a)\chi\|_H$$
$$= \|ua\varphi\|_H,$$

so that if $a\varphi = b\varphi$, then $\alpha(a)\chi = \alpha(b)\chi$ and hence $u$ is well defined. By this
computation $u$ is also isometric and since it is clearly surjective, it is unitary. The
property $\alpha(a) = uau^*$ is equivalent to $ua = \alpha(a)u$, which in turn is equivalent to
$uab\varphi = \alpha(a)ub\varphi$ for any $b \in B(H)$, which by definition of $u$ is the same as

$$\alpha(ab)\chi = \alpha(a)\alpha(b)\chi. \tag{5.112}$$

But this holds by virtue of $\alpha$ being an automorphism. Finally, all arbitrariness in $u$
lies in the lack of uniqueness of $\chi$ given its definition (5.110).


Proposition 5.28. Any anti-automorphism $\alpha: B(H) \to B(H)$ takes the form $\alpha = \alpha'_u$,
cf. (5.108), where $u: H \to H$ is anti-unitary, uniquely determined by $\alpha$ up to a phase.

Proof. Pick an arbitrary anti-unitary operator $J: H \to H$ and define

$$\beta: B(H) \to B(H);$$
$$\beta(a) = Ja^*J^*. \tag{5.113}$$

Then $\alpha \circ \beta$ is an automorphism, to which Proposition 5.25 applies, so that

$$\alpha \circ \beta(a) = \tilde{u}a\tilde{u}^*, \tag{5.114}$$

for some unitary $\tilde{u}$. Hence

$$\alpha(a) = \alpha(\beta \circ \beta^{-1}(a)) = \alpha \circ \beta(J^*a^*J) = \tilde{u}J^*a^*J\tilde{u}^*,$$

so that $\alpha(a) = ua^*u^*$ with $u = \tilde{u}J^*$.
The precise lack of uniqueness of $u$ is inherited from the unitary case.
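
As an illustration (not part of the original argument), take $H = \mathbb{C}^n$ and let $J$ be complex conjugation in the standard basis, which is anti-unitary with $J^* = J$; this specific choice of $J$ is an assumption made here for concreteness. Then $\beta$ is simply transposition:

```latex
\begin{align*}
(Ja^*J\psi)_i &= \overline{\textstyle\sum_j (a^*)_{ij}\,\overline{\psi_j}}
              = \textstyle\sum_j \overline{(a^*)_{ij}}\,\psi_j
              = \textstyle\sum_j a_{ji}\,\psi_j = (a^{T}\psi)_i,\\
\beta(a) &= a^{T}, \qquad \beta(ab) = (ab)^{T} = b^{T}a^{T} = \beta(b)\beta(a),\\
\alpha(a) &= \tilde{u}\,J^* a^* J\,\tilde{u}^* = \tilde{u}\, a^{T}\, \tilde{u}^*.
\end{align*}
```

With this choice, the proof says that an anti-automorphism of $M_n(\mathbb{C})$ is transposition followed by conjugation with a unitary $\tilde{u}$.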
5.5 Proof of Wigner’s Theorem


We recall Wigner’s Theorem, i.e. Theorem 5.4.1:


Theorem 5.29. Each bijection $W: \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ that satisfies

$$\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef) \qquad (e,f \in \mathcal{P}_1(H)), \tag{5.115}$$

is given by $W(e) = ueu^* = \alpha_u(e)$, where the operator $u$ is either unitary or anti-unitary, and is uniquely determined by $W$ up to a phase.

The problem is to lift a given map $W: \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ that satisfies (5.115) to
either a unitary or an anti-unitary map $u: H \to H$ such that

$$W(e_\psi) = e_{u\psi} = ue_\psi u^*. \tag{5.116}$$

Suppose $W(e_\psi) = e_{\psi'}$. Since $e_{z\psi} = e_\psi$ for any $z \in \mathbb{T}$, and likewise for $e_{\psi'}$, this means
that $u\psi = z\psi'$ for some $z \in \mathbb{T}$; the problem is to choose the $z$’s coherently all over
the unit sphere of $H$. There are many proofs in the literature, of which the following
one, partly based on an earlier proof by Bargmann (1964), has the advantage of
making at least the construction of $u$ explicit (at the cost of opaque proofs of some
crucial lemmas). We assume $\dim(H) > 2$, since $H = \mathbb{C}^2$ has already been covered.

Fix unit vectors $\psi \in H$ and $\psi' \in W(e_\psi)H$; clearly, $\psi'$ is unique up to multiplication by $z \in \mathbb{T}$, whose choice turns out to completely determine $u$ (i.e., the ambiguity
in $\psi'$ is the only one in the entire construction). For a modest start, we put

$$u\psi = \psi'. \tag{5.117}$$


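Lemma 5.30. If $V \subset H$ is a $k$-dimensional subspace (where $k < \infty$), then there is a unique $k$-dimensional linear subspace $V' \subset H$ with the following property:

For all unit vectors $\psi \in H$, we have $\psi \in V$ iff $W(e_\psi)H \subset V'$.

Proof. Pick an orthonormal basis $(\upsilon_1, \ldots, \upsilon_k)$ of $V$ and find unit vectors $\upsilon_i' \in H$ such that $\upsilon_i' \in W(e_{\upsilon_i})H$, $i = 1, \ldots, k$. Then, using (5.115) we compute

$$|\langle \upsilon_i', \upsilon_j' \rangle|^2 = \mathrm{Tr}(e_{\upsilon_i'} e_{\upsilon_j'}) = \mathrm{Tr}(W(e_{\upsilon_i}) W(e_{\upsilon_j})) = \mathrm{Tr}(e_{\upsilon_i} e_{\upsilon_j}) = |\langle \upsilon_i, \upsilon_j \rangle|^2 = \delta_{ij},$$

so that the vectors $(\upsilon_1', \ldots, \upsilon_k')$ form an orthonormal set and hence form a basis of their linear span $V'$. Now, as mentioned below (B.214), we have $\psi \in V$ iff $\sum_{i=1}^k |\langle \upsilon_i, \psi \rangle|^2 = 1$ and similarly $\psi' \in V'$ iff $\sum_{i=1}^k |\langle \upsilon_i', \psi' \rangle|^2 = 1$ (where $\psi'$ is a unit vector in $W(e_\psi)H$). Since $W$ preserves transition probabilities, a computation similar to the one just given yields

$$\sum_{i=1}^k |\langle \upsilon_i', \psi' \rangle|^2 = \sum_{i=1}^k |\langle \upsilon_i, \psi \rangle|^2, \tag{5.118}$$

so that both sides do or do not equal unity, and hence $\psi \in V$ iff $\psi' \in V'$.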




Wigner’s Theorem for $H = \mathbb{C}^2$ (i.e. Theorem 5.7) implies:

Lemma 5.31. If $V$ and $V'$ are related as in Lemma 5.30, and

$$\dim(V) = \dim(V') = 2, \tag{5.119}$$

then there is a unitary or anti-unitary operator $u_V : V \to V'$ such that

$$W(e) = u_V e u_V^*, \tag{5.120}$$

for any one-dimensional projection $e \in \mathcal{P}_1(V)$, where $\mathcal{P}_1(V) \subset \mathcal{P}_1(H)$ consists of all $e \in \mathcal{P}_1(H)$ with $eH \subset V$. Moreover, $u_V$ is unique up to a phase.

Proof. A choice of basis for both $V$ and $V'$ gives unitary isomorphisms $u : V \to \mathbb{C}^2$ and $u' : V' \to \mathbb{C}^2$, which jointly induce a map

$$W' = u' W u^{-1} : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2). \tag{5.121}$$

This map satisfies the hypotheses of Wigner’s Theorem in $d = 2$, and so it is (anti-)unitarily induced as $W' = \alpha_v$, where $v : \mathbb{C}^2 \to \mathbb{C}^2$ is (anti-)unitary. Then the operator $u_V = (u')^{-1} v u$ does the job; its lack of uniqueness stems entirely from $v$.


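Lemma 5.32. Given a Wigner symmetry $W$, the ensuing operator $u_V$ is either unitary or anti-unitary for all two-dimensional subspaces $V \subset H$ (simultaneously).

Proof. We first design a “unitarity test” for $W$. Define a function

$$T : \mathcal{P}_1(H) \times \mathcal{P}_1(H) \times \mathcal{P}_1(H) \to \mathbb{C}; \tag{5.122}$$
$$T(e, f, g) = \mathrm{Tr}(efg), \tag{5.123}$$
$$T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3}) = \langle \psi_1, \psi_2 \rangle \langle \psi_2, \psi_3 \rangle \langle \psi_3, \psi_1 \rangle. \tag{5.124}$$

Let $V \subset H$ be two-dimensional and pick an orthonormal basis $(\upsilon_1, \upsilon_2)$. Define

$$\chi_1 = \upsilon_1, \qquad \chi_2 = (\upsilon_1 - \upsilon_2)/\sqrt{2}, \qquad \chi_3 = (\upsilon_1 - i\upsilon_2)/\sqrt{2}. \tag{5.125}$$

A simple computation then shows that

$$T(e_{\chi_1}, e_{\chi_2}, e_{\chi_3}) = \tfrac{1}{4}(1 + i). \tag{5.126}$$

It follows from (5.124) that for $u$ unitary and $v$ anti-unitary, we have

$$T(e_{u\psi_1}, e_{u\psi_2}, e_{u\psi_3}) = T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3}); \tag{5.127}$$
$$T(e_{v\psi_1}, e_{v\psi_2}, e_{v\psi_3}) = \overline{T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3})}. \tag{5.128}$$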


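Eq. (5.126) implies that if $W : V \to V'$ is (anti-)unitarily implemented, we have

$$T(W(e_{\chi_1}), W(e_{\chi_2}), W(e_{\chi_3})) = T(e_{u\chi_1}, e_{u\chi_2}, e_{u\chi_3}) = \tfrac{1}{4}(1 \pm i), \tag{5.129}$$

with a plus sign if $u$ is unitary and a minus sign if $u$ is anti-unitary. Now take a second pair $(\tilde{V}, \tilde{V}')$ as above, and pick a basis $(\tilde{\upsilon}_1, \tilde{\upsilon}_2)$ of $\tilde{V}$, with associated vectors $(\tilde{\chi}_1, \tilde{\chi}_2, \tilde{\chi}_3)$, as in (5.125). Suppose $u : V \to V'$ implementing $W$ is unitary, whereas $\tilde{u} : \tilde{V} \to \tilde{V}'$ implementing $W$ is anti-unitary. It then follows from (5.129) that

$$T(W(e_{\chi_1}), W(e_{\chi_2}), W(e_{\chi_3})) = T(e_{u\chi_1}, e_{u\chi_2}, e_{u\chi_3}) = \tfrac{1}{4}(1 + i); \tag{5.130}$$
$$T(W(e_{\tilde{\chi}_1}), W(e_{\tilde{\chi}_2}), W(e_{\tilde{\chi}_3})) = T(e_{\tilde{u}\tilde{\chi}_1}, e_{\tilde{u}\tilde{\chi}_2}, e_{\tilde{u}\tilde{\chi}_3}) = \tfrac{1}{4}(1 - i). \tag{5.131}$$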


In view of (C.637), the following expression defines a metric $d$ on $\mathcal{P}_1(H)$:

$$d(e_\psi, e_\varphi) = \|\omega_\psi - \omega_\varphi\| = \|e_\psi - e_\varphi\|_1 = 2\sqrt{1 - |\langle \varphi, \psi \rangle|^2}, \tag{5.132}$$

with respect to which both $W$ and $T$ are continuous (the latter with respect to the product metric on $\mathcal{P}_1(H)^3$, of course). Let $t \mapsto (\upsilon_1(t), \upsilon_2(t))$ be a continuous path of orthonormal vectors (i.e., in $H \times H$), with associated vectors $(\chi_1(t), \chi_2(t), \chi_3(t))$, as in (5.125). Then the function $f(t) = T(W(e_{\chi_1(t)}), W(e_{\chi_2(t)}), W(e_{\chi_3(t)}))$ is continuous, and by (5.129) it can only take the values $\tfrac{1}{4}(1 \pm i)$. Hence $f(t)$ must be constant. However, taking a path such that $(\upsilon_1(0), \upsilon_2(0)) = (\upsilon_1, \upsilon_2)$ and $(\upsilon_1(1), \upsilon_2(1)) = (\tilde{\upsilon}_1, \tilde{\upsilon}_2)$ gives $f(0) = \tfrac{1}{4}(1 + i)$ and $f(1) = \tfrac{1}{4}(1 - i)$, which is a contradiction.
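The unitarity test used in this proof is easy to check numerically. The following sketch assumes $H = \mathbb{C}^2$ with the standard basis as $(\upsilon_1, \upsilon_2)$, an inner product that is antilinear in its first entry, and one arbitrarily chosen sample unitary; all function names are illustrative.

```python
import math

def inner(x, y):
    # Inner product on C^2, antilinear in the first entry (assumed convention).
    return sum(a.conjugate() * b for a, b in zip(x, y))

def triple(p1, p2, p3):
    # T(e_p1, e_p2, e_p3) = <p1,p2><p2,p3><p3,p1> for unit vectors, cf. (5.124).
    return inner(p1, p2) * inner(p2, p3) * inner(p3, p1)

s = 1 / math.sqrt(2)
v1, v2 = (1 + 0j, 0j), (0j, 1 + 0j)
chi1 = v1
chi2 = tuple(s * (a - b) for a, b in zip(v1, v2))        # (v1 - v2)/sqrt(2)
chi3 = tuple(s * (a - 1j * b) for a, b in zip(v1, v2))   # (v1 - i v2)/sqrt(2)

def apply_unitary(x):
    # Sample unitary (hypothetical choice): a real rotation times a phase.
    c, sn = math.cos(0.3), math.sin(0.3)
    ph = complex(math.cos(1.1), math.sin(1.1))
    return (ph * (c * x[0] - sn * x[1]), ph * (sn * x[0] + c * x[1]))

def apply_antiunitary(x):
    # The same unitary composed with complex conjugation in the standard basis.
    return apply_unitary(tuple(a.conjugate() for a in x))

t0 = triple(chi1, chi2, chi3)
t_u = triple(*(apply_unitary(c) for c in (chi1, chi2, chi3)))
t_a = triple(*(apply_antiunitary(c) for c in (chi1, chi2, chi3)))
```

Here `t0` reproduces the value in (5.126), `t_u` equals `t0` as in (5.127), and `t_a` is its complex conjugate as in (5.128).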


Lemma 5.33. Wigner’s Theorem holds for three-dimensional Hilbert spaces.

Proof. Let $(\upsilon_1, \upsilon_2, \upsilon_3)$ be some basis of $H$ (like the usual basis of $H = \mathbb{C}^3$). We first show that if $W$ is the identity if restricted to both $\mathrm{span}(\upsilon_1, \upsilon_2)$ and $\mathrm{span}(\upsilon_1, \upsilon_3)$, then $W$ is the identity on $H$ altogether. To this end, take $\psi = \sum_i c_i \upsilon_i$, initially with $c_1 \in \mathbb{R} \setminus \{0\}$. Take a unit vector $\psi' \in W(e_\psi)H$, with $\psi' = \sum_i c_i' \upsilon_i$. By the first assumption on $W$ we have $|\langle \upsilon, \psi' \rangle| = |\langle \upsilon, \psi \rangle|$ for any unit vector $\upsilon \in \mathrm{span}(\upsilon_1, \upsilon_2)$. Taking

$$\upsilon = \upsilon_1, \quad \upsilon = \upsilon_2, \quad \upsilon = (\upsilon_1 + \upsilon_2)/\sqrt{2}, \quad \upsilon = (\upsilon_1 + i\upsilon_2)/\sqrt{2}, \tag{5.133}$$

gives the equations

$$|c_1'| = |c_1|, \quad |c_2'| = |c_2|, \quad |c_1' + c_2'| = |c_1 + c_2|, \quad |c_1' - i c_2'| = |c_1 - i c_2|, \tag{5.134}$$

respectively. By a choice of phase we may and will assume $c_1' = c_1$, in which case the only solution is $c_2' = c_2$ (geometrically, the solution $c_2'$ lies in the intersection of three different circles in the complex plane, which is either empty or consists of a single point). Similarly, the second assumption on $W$ gives $c_3' = c_3$, whence $\psi' = \psi$. The case $c_1 = 0$ may be settled by a straightforward limit argument, since inner products (and hence their absolute values) are continuous on $H \times H$.

Given a Wigner symmetry $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$, we now construct $u$ as follows.

1. Fix a basis $(\upsilon_1, \upsilon_2, \upsilon_3)$ with “image” $(\upsilon_1', \upsilon_2', \upsilon_3')$ under $W$, i.e., $W(e_{\upsilon_i}) = e_{\upsilon_i'}$.

2. The unitarity test in the proof of Lemma 5.32 settles if the operators should be chosen to be unitary or anti-unitary; for simplicity we assume the unitary case.

3. Define a unitary $u_1 : H \to H$ by $u_1 \upsilon_i' = \upsilon_i$ for $i = 1, 2, 3$, and subsequently define $W_1 = \alpha_{u_1} \circ W$, which (being the composition of two Wigner symmetries) is a Wigner symmetry. Clearly, $W_1(e_{\upsilon_i}) = e_{\upsilon_i}$ ($i = 1, 2, 3$), so that $W_1$ maps $\mathcal{P}_1(H_{(12)})$ to itself, where $H_{(12)} = \mathrm{span}(\upsilon_1, \upsilon_2)$. Hence Lemma 5.31 gives a unitary map $\tilde{u}_2 : H_{(12)} \to H_{(12)}$ such that the restriction of $W_1$ to $H_{(12)}$ is $\alpha_{\tilde{u}_2}$.

4. Define a unitary $u_2 : H \to H$ by $u_2 = \tilde{u}_2^{-1}$ on $H_{(12)}$ and $u_2 \upsilon_3 = \upsilon_3$, followed by the Wigner symmetry $W_2 = \alpha_{u_2} \circ W_1$. By construction, $W_2(e_{\upsilon_i}) = e_{\upsilon_i}$ for $i = 1, 2, 3$ ($W_2$ is even the identity on $\mathcal{P}_1(H_{(12)})$), so that $W_2$ maps $\mathcal{P}_1(H_{(13)})$ to itself, where $H_{(13)} = \mathrm{span}(\upsilon_1, \upsilon_3)$. Hence the restriction of $W_2$ to $H_{(13)}$ is implemented by a unitary $\tilde{u}_3 : H_{(13)} \to H_{(13)}$, whose phase may be fixed by requiring $\tilde{u}_3 \upsilon_1 = \upsilon_1$.

5. Similarly to $u_2$, we define $u_3 : H \to H$ by $u_3 = \tilde{u}_3^{-1}$ on $H_{(13)}$ and $u_3 \upsilon_2 = \upsilon_2$, so that $u_3$ is the identity on $H_{(12)}$. Of course, we now define a Wigner symmetry

$$W_3 = \alpha_{u_3} \circ W_2 = \alpha_{u_3} \circ \alpha_{u_2} \circ \alpha_{u_1} \circ W, \tag{5.135}$$

which by construction is the identity on both $\mathcal{P}_1(H_{(12)})$ and $\mathcal{P}_1(H_{(13)})$, and so by the first part of the proof it must be the identity on all of $\mathcal{P}_1(H)$. Hence

$$W = \alpha_{u_1}^{-1} \circ \alpha_{u_2}^{-1} \circ \alpha_{u_3}^{-1} = \alpha_u \qquad (u = u_1^{-1} u_2^{-1} u_3^{-1}).$$


Lemma 5.34. As in Lemma 5.30, if $\dim(V) = \dim(V') = 3$, then there is a unitary or anti-unitary operator $u_V : V \to V'$ such that $W(e) = u_V e u_V^*$, for any $e \in \mathcal{P}_1(V)$.

Proof. Given Lemma 5.33, the proof is practically the same as for Lemma 5.31.


We now finish the proof of Wigner’s Theorem. We assume that the outcome of Lemma 5.32 is that each $u_V$ is unitary; the anti-unitary case requires obvious modifications of the argument below. The first step is, of course, to define $u(\lambda\psi) = \lambda u\psi$, $\lambda \in \mathbb{C}$ (so this would have been $\overline{\lambda} u\psi$ in the anti-unitary case). Let $\varphi \in H$ be linearly independent of $\psi$ and consider the two-dimensional space $V$ spanned by $\psi$ and $\varphi$. Define $u(\varphi) = u_V \varphi$, where the phase of $u_V$ is fixed by (5.117). With (5.117), this defines $u$ on all of $H$. To prove that $u$ is linear, take $\varphi_1$ and $\varphi_2$ linearly independent of each other and of $\psi$, so that the linear span $V_3$ of $\psi$, $\varphi_1$, and $\varphi_2$ is three-dimensional. Let $V_i$ be the two-dimensional linear span of $\psi$ and $\varphi_i$, $i = 1, 2$. Then $u\varphi_i = u_{V_i}\varphi_i$, where the phase of $u_{V_i}$ is fixed by (5.117). Let $w : V_3 \to V_3'$ be the unitary that implements $W$ according to Lemma 5.34, with phase determined by (5.117). Since $u_{V_1}$ and $u_{V_2}$ and $w$ are unique up to a phase and this phase has been fixed for each in the same way, we must have $u_{V_1} = w_{|V_1}$ and $u_{V_2} = w_{|V_2}$. Finally, we have $V_{12}$ spanned by $\psi$ and $\varphi_1 + \varphi_2$, and by the same token, $u_{V_{12}} = w_{|V_{12}}$. Now $w$ is unitary and hence linear, so

$$u(\varphi_1 + \varphi_2) = u_{V_{12}}(\varphi_1 + \varphi_2) = w(\varphi_1 + \varphi_2) = w(\varphi_1) + w(\varphi_2) = u_{V_1}(\varphi_1) + u_{V_2}(\varphi_2) = u(\varphi_1) + u(\varphi_2),$$

since this is how $u$ was defined. Since each $u_V$ is unitary, so is $u$, and similarly it is easy to verify that $u$ implements $W$, because each $u_V$ does so.
5.6 Some abstract representation theory


Since all symmetries we have considered (named after Wigner, Kadison, Jordan, Ludwig, von Neumann, and Bohr) are implemented by either unitary or anti-unitary operators, which are determined (by the given symmetry) only up to a phase $z \in \mathbb{T}$, the quantum-mechanical symmetry group $\mathcal{G}^H$ of a Hilbert space $H$ is given by

$$\mathcal{G}^H = (U(H) \cup U_a(H))/\mathbb{T}, \tag{5.136}$$

where $U(H)$ is the group of unitary operators on $H$, and $U_a(H)$ is the set of anti-unitary operators on $H$; the latter is not a group (since the product of two anti-unitaries is unitary) but their union is. Furthermore, $\mathbb{T}$ is identified with the normal subgroup $\mathbb{T} \equiv \mathbb{T} \cdot 1_H = \{z \cdot 1_H \mid z \in \mathbb{T}\}$ of $U(H) \cup U_a(H)$ (and also of $U(H)$) consisting of multiples of the unit operator by a phase; thus the quotient $\mathcal{G}^H$ is a group.

The fact that $\mathcal{G}^H$ rather than $U(H)$ is the symmetry group of quantum mechanics has profound consequences (one of which is our very existence), which we will study from §5.10 onwards. However, this material relies on the theory of “ordinary” (i.e., non-projective) unitary representations, which we therefore review first.

Namely, let $G$ be a group. In mathematics, the natural kind of action of $G$ on a Hilbert space $H$ is a unitary representation, i.e., a homomorphism

$$u : G \to U(H), \tag{5.137}$$

so that $u(x)^{-1} = u(x^{-1}) = u(x)^*$ and $u(x)u(y) = u(xy)$, which imply $u(e) = 1_H$.

As to the possible continuity properties of unitary representations in case that $G$ is a topological group (i.e., a group $G$ that is also a topological space, such that group multiplication $G \times G \to G$ and inverse $G \to G$ are continuous), one should equip $U(H)$ with the strong operator topology (as opposed to the norm topology).


Proposition 5.35. If $u : x \mapsto u(x)$ is a unitary representation of some locally compact group $G$ on a Hilbert space $H$, then the following conditions are equivalent:

1. The map $G \times H \to H$, $(x, \psi) \mapsto u(x)\psi$, is continuous;
2. The map $G \to U(H)$, $x \mapsto u(x)$, is continuous in the strong topology on $U(H)$.

Proof. Strong continuity means that if $x_\lambda \to x$ in $G$, then for each $\psi \in H$ we have $\|(u(x_\lambda) - u(x))\psi\| \to 0$. This is clearly implied by the first kind of continuity, giving $1 \Rightarrow 2$, so let us prove the nontrivial converse. Suppose $x_\lambda \to x$ and $\psi_\mu \to \psi$; since $G$ is locally compact, $x$ has a compact neighborhood $K$ and we may assume that each $x_\lambda \in K$. If $u$ is strongly continuous, then for any $\varphi \in H$ the set $\{u(y)\varphi, y \in K\}$ is compact in $H$ and hence bounded. The Banach-Steinhaus Theorem B.78 gives boundedness of the corresponding operator norms, that is, $\|u(y)\| \leq C_K$ for all $y \in K$ and some $C_K > 0$. We now estimate

$$\|u(x_\lambda)\psi_\mu - u(x)\psi\| \leq \|u(x_\lambda)\psi_\mu - u(x_\lambda)\psi\| + \|u(x_\lambda)\psi - u(x)\psi\|.$$

The first term vanishes as $\psi_\mu \to \psi$ since it is bounded by $C_K \|\psi_\mu - \psi\|$, whereas the second vanishes as $x_\lambda \to x$ by the (assumed) strong continuity of $u$.
Since the first kind of continuity is the usual one for group actions, this justifies the
choice of strong continuity as the natural one for unitary representations (to which 
a pragmatic point may be added: norm continuity is quite rare for unitary represen- 
tations on infinite-dimensional Hilbert spaces). Things further simplify under mild 
restrictions on G and H, which are satisfied in all examples of physical interest. 


Proposition 5.36. If $H$ is separable and $G$ is second countable locally compact (sclc), then each of the two continuity conditions in Proposition 5.35 is in turn equivalent to weak measurability of $u$, in that for each $\varphi, \psi \in H$ the function

$$x \mapsto \langle \varphi, u(x)\psi \rangle$$

from $G$ to $\mathbb{C}$ is (Borel) measurable.

Proof. This spectacular result is due to von Neumann, who more generally proved that a measurable homomorphism between sclc groups is continuous. This implies the claim: first, if $H$ is separable, then the group $U(H)$ is sclc in its weak operator topology, so that if the map $G \to U(H)$, $x \mapsto u(x)$ is weakly measurable, then it is continuous in the weak topology on $U(H)$. Second, for any Hilbert space, weak (operator) continuity of a unitary representation implies strong continuity (so that, given the trivial converse, weak and strong continuity of unitary group representations are equivalent). We only prove this last claim: for $x, y \in G$, we compute

$$\|(u(y) - u(x))\psi\|^2 = \|u(x)\psi\|^2 + \|u(y)\psi\|^2 - \langle u(x)\psi, u(y)\psi \rangle - \langle u(y)\psi, u(x)\psi \rangle = 2\|\psi\|^2 - \langle \psi, u(x^{-1}y)\psi \rangle - \langle \psi, u(y^{-1}x)\psi \rangle.$$

Weak continuity obviously implies that the function $x \mapsto \langle \psi, u(x)\psi \rangle$ is continuous at the identity $e \in G$, so if $y = x_\lambda \to x$, then $\|(u(x_\lambda) - u(x))\psi\| \to 0$.


In view of this, it is hardly a restriction for a unitary representation of a locally com- 
pact group on a Hilbert space to be continuous in the sense of Proposition 5.35, so 
we always assume this in what follows. Furthermore, any group we consider is lo- 
cally compact, so this will be a standing assumption, too. An important consequence 
of this assumption is the existence of a translation-invariant measure on G. 


Theorem 5.37. Each locally compact group $G$ has a canonical nonzero (outer regular Borel) measure $\mu$, called Haar measure, which is left-invariant in that

$$\int_G d\mu(x)\, L_y f(x) = \int_G d\mu(x)\, f(x), \tag{5.138}$$

for each $f \in C_c(G)$ and $y \in G$, where the left translation $L_y$ of $f$ by $y$ is defined by

$$L_y f(x) = f(y^{-1}x). \tag{5.139}$$

This measure is unique up to scalar multiplication. Moreover, if $G$ is compact, then:

1. $\mu$ is finite and hence can be normalized to a probability measure, i.e.,

$$\mu(G) = 1. \tag{5.140}$$

2. $\mu$ is also right-invariant in that

$$\int_G d\mu(x)\, R_y f(x) = \int_G d\mu(x)\, f(x), \tag{5.141}$$

where the right translation $R_y$ of $f$ by $y \in G$ is defined by

$$R_y f(x) = f(xy). \tag{5.142}$$

3. $\mu$ is invariant under inversion, in that

$$\int_G d\mu(x)\, f(x^{-1}) = \int_G d\mu(x)\, f(x). \tag{5.143}$$

Existence is due to Haar and uniqueness was first proved by von Neumann. One often writes $dx = d\mu(x)$ for Haar measure. Here are some examples:

• For $G = \mathbb{R}^n$, Haar measure equals Lebesgue measure $\mu_L$ (up to a constant); eqs. (5.139) and (5.141) state the familiar translation invariance of $\mu_L$.
• For $G = \mathbb{T}$, we have

$$\int_{\mathbb{T}} d\mu(z)\, f(z) = \frac{1}{2\pi} \int_0^{2\pi} d\theta\, f(e^{i\theta}). \tag{5.144}$$

• For $G = GL_n(\mathbb{R})$ with $X = (x_{ij})$, we have

$$d\mu(X) = \prod_{i,j=1}^n dx_{ij}\, |\det(X)|^{-n}, \tag{5.145}$$

which for $G = SL_n(\mathbb{R})$ of course simplifies to $d\mu(X) = \prod_{i,j} dx_{ij}$.
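The invariance properties of Haar measure on the circle group can be illustrated numerically. The sketch below approximates (5.144) by an equally spaced Riemann sum (exact for trigonometric polynomials of low degree) and compares the integral of a test function with those of its left translate and of its inversion; the test function `f` and the group element `y` are arbitrary illustrative choices.

```python
import cmath
import math

def haar_integral(f, n=512):
    # (1/2pi) * integral over [0, 2pi] of f(e^{i theta}), cf. (5.144),
    # approximated by an n-point Riemann sum on the unit circle.
    return sum(f(cmath.exp(2j * math.pi * m / n)) for m in range(n)) / n

def f(z):
    # Arbitrary test function on T (a trigonometric polynomial).
    return 3 + z + 2 * z.conjugate() ** 2 + abs(z - 0.5) ** 2

def left_translate(f, y):
    # L_y f(x) = f(y^{-1} x), cf. (5.139).
    return lambda x: f(x / y)

y = cmath.exp(0.7j)                               # group element of T
I0 = haar_integral(f)
I_left = haar_integral(left_translate(f, y))      # left-invariance (5.138)
I_inv = haar_integral(lambda x: f(1 / x))         # inversion invariance (5.143)
```

Since $\mathbb{T}$ is abelian, left and right translations coincide here, so the same check also covers (5.141).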


Definition 5.38. A unitary representation $u$ of a group $G$ on a Hilbert space $H$ is irreducible if the only closed subspaces $K$ of $H$ that are stable under $u(G)$ (in the sense that if $\psi \in K$, then $u(x)\psi \in K$ for all $x \in G$) are either $K = H$ or $K = \{0\}$.

We will often need two important results about irreducibility. The first is Schur’s Lemma, in which the commutant $S'$ of some subset $S \subset B(H)$ is defined by

$$S' = \{a \in B(H) \mid ab = ba \ \forall\, b \in S\}. \tag{5.146}$$

Lemma 5.39. A unitary representation $u$ of a group $G$ is irreducible iff

$$u(G)' = \mathbb{C} \cdot 1, \tag{5.147}$$

i.e., if $au(x) = u(x)a$ for each $x \in G$ implies $a = \lambda \cdot 1_H$ for some $\lambda \in \mathbb{C}$.

This follows from Theorem C.90, of which the above lemma is a special case: take $A = u(G)'' = (u(G)')'$. The second is part of the Peter-Weyl Theorem.


Theorem 5.40. Irreducible representations of compact groups are finite-dimensional.

Proof. We first reduce the situation to the unitary case: if $\langle \cdot, \cdot \rangle'$ is the given inner product on $H$, we define a new inner product $\langle \cdot, \cdot \rangle$ by averaging with respect to Haar measure $dx = d\mu(x)$, i.e.,

$$\langle \psi, \varphi \rangle = \int_G dx\, \langle u(x)\psi, u(x)\varphi \rangle'. \tag{5.148}$$

Using (5.141), it is easy to verify that this new inner product makes $u$ unitary.
So let $u : G \to U(H)$ be an irreducible unitary representation. For each unit vector $\varphi \in H$ and $x \in G$, we define the following projection and its $G$-average:

$$e_{u(x)\varphi} = |u(x)\varphi\rangle \langle u(x)\varphi|, \tag{5.149}$$
$$w_\varphi = \int_G dx\, e_{u(x)\varphi}. \tag{5.150}$$

The Weyl operator (5.150) is initially defined as a quadratic form by

$$\langle \psi_1, w_\varphi \psi_2 \rangle = \int_G dx\, \langle \psi_1, e_{u(x)\varphi} \psi_2 \rangle. \tag{5.151}$$

The integral exists because the integrand is continuous and bounded, defining a bounded quadratic form by the estimate $|\langle \psi_1, w_\varphi \psi_2 \rangle| \leq \|\psi_1\| \|\psi_2\|$, where we assumed (5.140) and used $\|e_{u(x)\varphi}\| = 1$, as (5.149) is a nonzero projection. Thus the operator $w_\varphi$ may be reconstructed from its matrix elements (5.151), cf. Proposition B.79. It is easy to verify that $[w_\varphi, u(y)] = 0$ for each $y \in G$, so that Schur’s Lemma yields $w_\varphi = \lambda_\varphi \cdot 1_H$ for some $\lambda_\varphi \in \mathbb{C}$. Hence $\langle \psi, w_\varphi \psi \rangle = \lambda_\varphi \|\psi\|^2$, in other words,

$$\int_G dx\, |\langle \psi, u(x)\varphi \rangle|^2 = \lambda_\varphi \|\psi\|^2. \tag{5.152}$$

If we now interchange $\varphi$ and $\psi$ and use (5.143) we find $\lambda_\varphi \|\psi\|^2 = \lambda_\psi \|\varphi\|^2$, so that, taking $\psi$ to be a unit vector, too, since $\psi$ and $\varphi$ are arbitrary we obtain $\lambda_\varphi = \lambda_\psi \equiv \lambda$, where in fact $\lambda > 0$, as follows by taking $\psi = \varphi$ in (5.152). Finally, take $n$ orthonormal vectors $(\upsilon_1, \ldots, \upsilon_n)$ in $H$, so that also $(u(x)\upsilon_1, \ldots, u(x)\upsilon_n)$ are orthonormal (since $u(x)$ is unitary), upon which Bessel’s inequality (B.212) gives

$$\sum_{i=1}^n |\langle \psi, u(x)\upsilon_i \rangle|^2 \leq \|\psi\|^2. \tag{5.153}$$

Integrating both sides over $G$, taking $\|\psi\| = 1$, and using (5.140) gives

$$\sum_{i=1}^n \int_G dx\, |\langle \psi, u(x)\upsilon_i \rangle|^2 \leq 1. \tag{5.154}$$

On the other hand, summing (5.152) over $i$ simply yields $n\lambda$, whence $n\lambda \leq 1$, for any $n \leq \dim(H)$. Since $\lambda > 0$ this forces $\dim(H) < \infty$.
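Equation (5.152) can be checked numerically in a toy case. The sketch below assumes, as an illustration not taken from the text, the finite (hence compact) dihedral group $D_3$ in its two-dimensional irreducible representation, for which normalized Haar measure is the normalized counting measure. Taking the trace of (5.150) gives $\lambda \dim(H) = 1$, so one expects $\lambda = 1/2$.

```python
import math

def d3_elements():
    # The six elements of D_3 in its two-dimensional irreducible
    # representation: three rotations and three reflections.
    mats = []
    for k in range(3):
        c, s = math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)
        mats.append([[c, -s], [s, c]])   # rotation by 2*pi*k/3
        mats.append([[c, s], [s, -c]])   # that rotation after (x, y) -> (x, -y)
    return mats

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(x, y):
    # Inner product antilinear in the first entry (assumed convention).
    return sum(a.conjugate() * b for a, b in zip(x, y))

def averaged_overlap(psi, phi):
    # Left-hand side of (5.152), with the group average as Haar integral.
    group = d3_elements()
    return sum(abs(inner(psi, apply(g, phi))) ** 2 for g in group) / len(group)

phi = [0.6 + 0j, 0.8j]              # unit vector
psi = [1.0 - 2.0j, 0.5 + 0.25j]     # arbitrary vector
norm_sq = inner(psi, psi).real
lam = averaged_overlap(psi, phi) / norm_sq
```

The value of `lam` does not depend on the particular `phi` and `psi` chosen, in line with the argument above.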
5.7 Representations of Lie groups and Lie algebras


We now assume that $G$ is a Lie group; as in §3.3, for our purposes we may restrict ourselves to linear Lie groups, i.e. closed subgroups of $GL_n(\mathbb{K})$ for $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. Let $u : G \to U(H)$ be a unitary representation of a Lie group $G$ on some Hilbert space $H$ (assumed strongly continuous). If $H$ is finite-dimensional, the following operation is unproblematic: for $A \in \mathfrak{g}$ (i.e. the Lie algebra of $G$) we define an operator

$$u'(A) : H \to H; \tag{5.155}$$
$$u'(A) = \frac{d}{dt} u(e^{tA})_{|t=0}. \tag{5.156}$$

This gives a linear map $u' : \mathfrak{g} \to B(H)$, which satisfies

$$[u'(A), u'(B)] = u'([A, B]); \tag{5.157}$$
$$u'(A)^* = -u'(A). \tag{5.158}$$

Note that physicists use Planck’s constant $\hbar > 0$ and like to write

$$\pi(A) = i\hbar u'(A), \tag{5.159}$$

so that one has the following commutation relations and self-adjointness condition:

$$[\pi(A), \pi(B)] = i\hbar \pi([A, B]); \tag{5.160}$$
$$\pi(A)^* = \pi(A). \tag{5.161}$$

If one knows that $u' : \mathfrak{g} \to B(H)$ comes from $u : G \to U(H)$, one conversely has

$$u(e^A) = e^{u'(A)} = e^{-\frac{i}{\hbar}\pi(A)}. \tag{5.162}$$


More generally, we call a map $\rho : \mathfrak{g} \to B(H)$ (where $H \cong \mathbb{C}^n$ remains finite-dimensional, so that $\rho : \mathfrak{g} \to M_n(\mathbb{C})$), a skew-adjoint representation of $\mathfrak{g}$ on $H$ if

$$[\rho(A), \rho(B)] = \rho([A, B]); \tag{5.163}$$
$$\rho(A)^* = -\rho(A). \tag{5.164}$$

The property of irreducibility of such a representation $\rho : \mathfrak{g} \to B(H)$ is defined in the same way as for groups, namely that the only linear subspaces of $H = \mathbb{C}^n$ that are stable under $\rho(\mathfrak{g})$ are $\{0\}$ and $H$. Equivalently, by Schur’s Lemma, $\rho(\mathfrak{g})$ is irreducible iff the only operators that commute with all $\rho(A)$ are multiples of the unit operator. If $\rho = u'$ for some unitary representation $u(G)$, it is easy to see that $u$ is irreducible iff $u'$ is irreducible. In view of this, it is a reasonable strategy to try and construct irreducible unitary representations $u(G)$ by starting, as it were, from $u'(\mathfrak{g})$. More precisely, if $\rho$ is some (irreducible) skew-adjoint representation of $\mathfrak{g}$, we may ask if there is a (necessarily irreducible) unitary representation $u(G)$ such that $\rho = u'$. Writing $\exp(\rho)$ for $u$, one would therefore hope that

$$u(e^A) = \exp(\rho)(e^A) = e^{\rho(A)}, \tag{5.165}$$

as in (5.162). Note that if $G$ is connected, then $\rho$ duly defines $u(x)$ for each $x \in G$ through (5.165), since by Lie theory every element $x$ of a connected Lie group is a finite product $x = \exp(A_1) \cdots \exp(A_n)$ of exponentials of elements $(A_1, \ldots, A_n)$ of $\mathfrak{g}$.

In general, this hope is in vain, since although each operator $e^{\rho(A)}$ is unitary, the representation property $u(x)u(y) = u(xy)$ may fail for global reasons. For example, if $G = SO(3)$, then $\mathfrak{g} \cong \mathbb{R}^3$, with basis $(J_1, J_2, J_3)$, as in (3.66). Define an a priori linear map $\rho : \mathfrak{g} \to M_2(\mathbb{C})$ by linear extension of

$$\rho(J_k) = -\tfrac{1}{2} i \sigma_k, \tag{5.166}$$

where $(\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices (5.42), so that physicists would write

$$\pi(J_k) = \tfrac{1}{2} \hbar \sigma_k, \tag{5.167}$$

cf. (5.159). This is easily checked to give a skew-adjoint representation of $\mathfrak{g}$, but it does not exponentiate to a unitary representation of $SO(3)$: as already mentioned after Proposition 5.46, if $\mathbf{u}$ is a unit vector in $\mathbb{R}^3$, then a rotation $R_\theta(\mathbf{u})$ around the $\mathbf{u}$-axis by an angle $\theta \in [0, 2\pi]$ is represented by

$$u(R_\theta(\mathbf{u})) = \cos(\theta/2) \cdot 1_2 + i \sin(\theta/2)\, \mathbf{u} \cdot \boldsymbol{\sigma}. \tag{5.168}$$

Consequently, $u(R_\pi(\mathbf{u})) = i\, \mathbf{u} \cdot \boldsymbol{\sigma}$, so that $u(R_\pi(\mathbf{u}))^2 = -1_2$, although within $SO(3)$ one has $R_\pi(\mathbf{u})^2 = e$, the unit of $SO(3)$, so that $u(R_\pi(\mathbf{u}))^2 \neq u(R_\pi(\mathbf{u})^2)$.
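The failure of the representation property can be seen in a short numerical sketch based on (5.168). The axis `u` below is an arbitrary unit vector chosen for illustration, and the function names are hypothetical.

```python
import math

def rot_spin_half(theta, u):
    # cos(theta/2) * 1_2 + i sin(theta/2) * (u . sigma), cf. (5.168),
    # with u . sigma = [[uz, ux - i uy], [ux + i uy, -uz]].
    ux, uy, uz = u
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * uz, 1j * s * (ux - 1j * uy)],
            [1j * s * (ux + 1j * uy), c - 1j * s * uz]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

u = (1 / 3, 2 / 3, 2 / 3)                    # unit vector: 1/9 + 4/9 + 4/9 = 1
half_turn = rot_spin_half(math.pi, u)
squared = matmul(half_turn, half_turn)       # operator assigned to R_pi(u)^2
full_turn = rot_spin_half(2 * math.pi, u)    # formula (5.168) at theta = 2*pi
```

Both `squared` and `full_turn` equal $-1_2$ up to rounding, whereas the corresponding element of $SO(3)$ is the identity, which would have to be represented by $1_2$.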

However, $\rho$ does exponentiate to a representation of $SU(2)$, which happens to be the universal covering group of $SO(3)$. This is typical of the general situation, which we state without proofs. We first need a refinement of Lie’s Third Theorem:

Theorem 5.41. Let $G$ be a connected Lie group with Lie algebra $\mathfrak{g}$. There exists a simply connected Lie group $\tilde{G}$, unique up to isomorphism, such that:

• The Lie algebra of $\tilde{G}$ is $\mathfrak{g}$.

• $G \cong \tilde{G}/D$, where $D$ is a discrete normal subgroup of the center of $\tilde{G}$.

• $D \cong \pi_1(G)$, i.e. the fundamental group of $G$, which is therefore abelian.

For example, for $G = SO(3)$ we have $\tilde{G} = SU(2)$ and $D = \mathbb{Z}_2$, cf. Proposition 5.46.


Theorem 5.42. Let $G_1$ and $G_2$ be Lie groups, with Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose that $G_1$ is simply connected. Then every Lie algebra homomorphism $\varphi : \mathfrak{g}_1 \to \mathfrak{g}_2$ comes from a unique Lie group homomorphism $\Phi : G_1 \to G_2$ through $\varphi = \Phi'$, where (realizing $G_1$ and $G_2$ as matrices)

$$\Phi'(X) = \frac{d}{dt} \Phi(e^{tX})_{|t=0}. \tag{5.169}$$

Let $H$ be a finite-dimensional Hilbert space, so that $B(H) \cong M_n(\mathbb{C})$, where $n = \dim(H)$, and take $U(H) = U_n(\mathbb{C})$ to be the group of all unitary matrices on $\mathbb{C}^n$. The Lie algebra $\mathfrak{u}_n(\mathbb{C})$ of $U_n(\mathbb{C})$ consists of all skew-adjoint $n \times n$ complex matrices. Since irreducibility is preserved under the correspondence $u(G) \leftrightarrow u'(\mathfrak{g})$, we infer:
Corollary 5.43. Let $G$ be a simply connected Lie group with Lie algebra $\mathfrak{g}$. Any finite-dimensional skew-adjoint representation $\rho : \mathfrak{g} \to \mathfrak{u}_n(\mathbb{C})$ of $\mathfrak{g}$ comes from a unique unitary representation $u(G)$ through (5.156), in which case we have

$$e^{\rho(A)} = u(e^A) \qquad (A \in \mathfrak{g}). \tag{5.170}$$

Thus there is a bijective correspondence between finite-dimensional unitary representations of $G$ and finite-dimensional skew-adjoint representations of $\mathfrak{g}$. In particular, if $G$ is compact, this specializes to a bijective correspondence between unitary irreducible representations of $G$ and skew-adjoint irreducible representations of $\mathfrak{g}$.

If $G \cong \tilde{G}/D$ is connected but not simply connected, then a finite-dimensional skew-adjoint representation $\rho : \mathfrak{g} \to B(H)$ exponentiates to a unitary representation $u : G \to U(H)$ iff the representation $\exp(\rho) : \tilde{G} \to U(H)$ is trivial on $D$.

For example, for $G = SO(3)$, the last condition is satisfied for the irreducible representations with integer spins $j \in \mathbb{N}$ (as well as for $j = 0$), see §5.8.

A similar construction is possible when $H$ is infinite-dimensional, except for the fact that the derivative in (5.156) may not exist. For example, $G = \mathbb{R}$ has its canonical regular representation on $H = L^2(\mathbb{R})$, defined by $u(a)\psi(x) = \psi(x - a)$, in which case (5.159) gives some multiple of the momentum operator $-i\hbar\, d/dx$. This operator is unbounded and hence is not defined on all of $H$, see also §5.11 and §5.12. As in Stone’s Theorem 5.73, this problem is solved by finding a suitable domain in $H$ on which the underlying limit, taken strongly, does exist. This is the Gårding domain

$$D_G = \{u^{\int}(f)\psi \mid f \in C_c^\infty(G),\ \psi \in H\}, \tag{5.171}$$

where for each $f \in C_c^\infty(G)$ (or even $f \in L^1(G)$) the operator $u^{\int}(f)$ is defined by

$$u^{\int}(f) = \int_G dx\, f(x) u(x). \tag{5.172}$$

Like the derivative $u'$, this integral is most easily defined weakly, i.e., the (bounded) operator $u^{\int}(f)$ is initially defined as a bounded quadratic form

$$Q(\varphi, \psi) = \int_G dx\, f(x) \langle \varphi, u(x)\psi \rangle, \tag{5.173}$$

from which the operator $u^{\int}(f)$ may be reconstructed as in Proposition B.79. Note that the function $x \mapsto \langle \varphi, u(x)\psi \rangle$ is in $C_b(G)$, so that the integral (5.173) exists.

It can be shown that $D_G$ is dense in $H$, as well as invariant under $u'(\mathfrak{g})$, in the sense that if $\psi \in D_G$, then $u'(A)\psi \in D_G$ for any $A \in \mathfrak{g}$. Furthermore, for each $\varphi \in D_G$ the function $x \mapsto u(x)\varphi$ from $G$ to $H$ is smooth (if $G$ is unimodular this property even characterizes $D_G$). The commutation relations (5.157) then hold on $D_G$, but the equalities (5.164) do not: one has to choose between (5.157) and (5.164), since the latter holds for the closure of each $\rho(A)$ (i.e., each $i\rho(A)$ is essentially self-adjoint on $D_G$), whose domain however depends on $A$: there is no common domain on which each $i\rho(A)$ is self-adjoint and the commutation relations (5.157) hold.
5.8 Irreducible representations of SU(2)


One of the most important groups in quantum physics is $SU(2)$, both as an internal symmetry group (e.g. of the Heisenberg model of ferromagnetism, of the weak nuclear interaction, and possibly also of (loop) quantum gravity) and as a spatial symmetry group in disguise (all projective unitary representations of $SO(3)$ come from unitary representations of $SU(2)$, preserving irreducibility, cf. Corollary 5.61). In this section we review the well-known classification and construction of its unitary irreducible representations. Since $SU(2)$ is compact, by Theorem 5.40 all its unitary irreducible representations are finite-dimensional. Since $G = SU(2)$ is also simply connected, by Corollary 5.43 its irreducible finite-dimensional (unitary) representations $u$ bijectively correspond to the irreducible finite-dimensional skew-adjoint representations $\rho = u'$ of its Lie algebra $\mathfrak{g}$. Hence our job is to find the latter.

We already encountered the basis (3.66) of the Lie algebra $\mathfrak{so}(3) \cong \mathbb{R}^3$ of $SO(3)$; the corresponding basis of the Lie algebra $\mathfrak{su}(2)$ of $SU(2)$ is $(S_1, S_2, S_3)$, where

$$S_k = -\tfrac{1}{2} i \sigma_k, \tag{5.174}$$

and the $\sigma_k$ are the Pauli matrices given in (5.42); linear extension of the map $J_k \mapsto S_k$ defines an isomorphism between $\mathfrak{so}(3)$ and $\mathfrak{su}(2)$. These matrices satisfy

$$[S_i, S_j] = \epsilon_{ijk} S_k, \tag{5.175}$$

where $\epsilon_{ijk}$ is the totally anti-symmetric symbol with $\epsilon_{123} = 1$ etc., so that (5.175) comes down to $[S_1, S_2] = S_3$, $[S_3, S_1] = S_2$, and $[S_2, S_3] = S_1$. By linearity, finding $\rho$ is the same as finding $n \times n$ matrices

$$L_k = i\rho(S_k) \tag{5.176}$$

that satisfy

$$[L_i, L_j] = i\epsilon_{ijk} L_k, \tag{5.177}$$

i.e., $[L_1, L_2] = iL_3$, etc., and

$$L_k^* = L_k. \tag{5.178}$$
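The relations (5.175) and (5.177) can be verified directly in the defining two-dimensional representation, where $\rho(S_k) = S_k$ and hence $L_k = \tfrac{1}{2}\sigma_k$. The sketch below uses plain nested lists for $2 \times 2$ matrices; the helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
S = [scale(-0.5j, s) for s in sigma]   # S_k = -(i/2) sigma_k, cf. (5.174)
L = [scale(1j, s) for s in S]          # L_k = i rho(S_k) with rho(S_k) = S_k

checks = {
    "[S1,S2]=S3": close(comm(S[0], S[1]), S[2]),
    "[S2,S3]=S1": close(comm(S[1], S[2]), S[0]),
    "[S3,S1]=S2": close(comm(S[2], S[0]), S[1]),
    "[L1,L2]=iL3": close(comm(L[0], L[1]), scale(1j, L[2])),
}
```

Each entry of `checks` records whether the corresponding relation holds for these matrices.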


It turns out to be convenient to introduce the ladder operators

$$L_\pm = L_1 \pm iL_2, \tag{5.179}$$

with ensuing commutation relations

$$[L_3, L_\pm] = \pm L_\pm; \tag{5.180}$$
$$[L_+, L_-] = 2L_3. \tag{5.181}$$

Furthermore, we define the Casimir operator

$$C = L_1^2 + L_2^2 + L_3^2, \tag{5.182}$$

which, crucially, commutes with each $L_k$, i.e.,

$$[C, L_k] = 0 \qquad (k = 1, 2, 3). \tag{5.183}$$

By Schur’s lemma, in any irreducible representation we therefore must have

$$C = c \cdot 1_H, \tag{5.184}$$

where $c \in \mathbb{R}$ (in fact, $c \geq 0$). We will also use the additional algebraic relations

$$L_+ L_- = C - L_3(L_3 - 1_H); \tag{5.185}$$
$$L_- L_+ = C - L_3(L_3 + 1_H). \tag{5.186}$$

The simple idea is now to diagonalize $L_3$, which is possible as $L_3^* = L_3$. Hence

$$H = \bigoplus_{\lambda \in \sigma(L_3)} H_\lambda, \tag{5.187}$$

where $\sigma(L_3)$ is the spectrum of $L_3$ (which in this finite-dimensional case consists of its eigenvalues), and $H_\lambda$ is the eigenspace of $L_3$ for eigenvalue $\lambda$ (i.e., if $\upsilon \in H_\lambda$, then $L_3\upsilon = \lambda\upsilon$). The structure of (5.187) in irreducible representations is as follows.

Lemma 5.44. Let $\rho : \mathfrak{su}(2) \to B(H)$ be a finite-dimensional skew-adjoint irreducible representation, so that (5.177) holds. Then the spectrum $\sigma(L_3)$ of the self-adjoint operator $L_3 = i\rho(S_3)$ is given by

$$\sigma(L_3) = \{-j, -j+1, \ldots, j-1, j\}. \tag{5.188}$$

If (5.187) is the spectral decomposition of $H$ relative to $L_3$, then:

1. The subspace $H_\lambda$ is one-dimensional for each $\lambda \in \sigma(L_3)$;
2. For $\lambda < j$ the operator $L_+$ maps $H_\lambda$ to $H_{\lambda+1}$, whereas $L_+ = 0$ on $H_j$;
3. For $\lambda > -j$ the operator $L_-$ maps $H_\lambda$ to $H_{\lambda-1}$, whereas $L_- = 0$ on $H_{-j}$.


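Proof. For any $\lambda \in \sigma(L_3)$ and nonzero $\upsilon_\lambda \in H_\lambda$, we have:

• either $\lambda + 1 \in \sigma(L_3)$ and $L_+\upsilon_\lambda \in H_{\lambda+1}$ (as a nonzero vector);
• or $L_+\upsilon_\lambda = 0$.

Indeed, (5.180) gives $L_3(L_+\upsilon_\lambda) = (\lambda + 1)L_+\upsilon_\lambda$, which immediately yields the claim. Similarly, either $\lambda - 1 \in \sigma(L_3)$ and $L_-\upsilon_\lambda \in H_{\lambda-1}$, or $L_-\upsilon_\lambda = 0$. Now let $\lambda_0 = \min \sigma(L_3)$ be the smallest eigenvalue of $L_3$, and pick some $0 \neq \upsilon_{\lambda_0} \in H_{\lambda_0}$. Since $H$ is finite-dimensional by assumption, there must be some $k \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$ such that $L_+^{k+1}\upsilon_{\lambda_0} = 0$, whereas all vectors $L_+^l \upsilon_{\lambda_0}$ for $l = 0, \ldots, k$ are nonzero (and lie in $H_{\lambda_0 + l}$). With $c$ defined as in (5.184), it then follows from (5.185) - (5.186) that

$$c - \lambda_0(\lambda_0 - 1) = 0; \tag{5.189}$$
$$c - (\lambda_0 + k)(\lambda_0 + k + 1) = 0. \tag{5.190}$$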

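These relations imply $\lambda_0 = -k/2$, so that by the above bullet points we also have

$$\{-k/2, -k/2 + 1, \ldots, k/2 - 1, k/2\} \subseteq \sigma(L_3). \tag{5.191}$$

To prove equality, as in (5.188), consider the vector space

$$H' = \mathbb{C} \cdot \upsilon_{\lambda_0} \oplus \mathbb{C} \cdot L_+\upsilon_{\lambda_0} \oplus \cdots \oplus \mathbb{C} \cdot L_+^{k-1}\upsilon_{\lambda_0} \oplus \mathbb{C} \cdot L_+^k\upsilon_{\lambda_0} \subseteq H; \tag{5.192}$$

this is just the subspace of $H$ with basis $(\upsilon_{\lambda_0}, L_+\upsilon_{\lambda_0}, \ldots, L_+^{k-1}\upsilon_{\lambda_0}, L_+^k\upsilon_{\lambda_0})$. By the previous arguments following from (5.180), we see that the operators $L_+$ and $L_-$ never leave $H'$, and the same is trivially true for $L_3$. Therefore, if $\rho$ is irreducible, then we must have $H' = H$ (and conversely). All claims of the lemma are now trivially verified on $H'$.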


It should be clear from this proof that the actions of $L_+$, $L_-$, and $L_3$ (and hence of all elements of $\mathfrak{su}(2)$) on $H' = H$ are fixed, so that $\rho$ is determined by its dimension

$$\dim(H) = 2j + 1, \tag{5.193}$$

from which it follows that $j$ can only take the values $0, 1/2, 1, 3/2, \ldots$.

It remains to fix an inner product on $H'$ in which $\rho$ is skew-adjoint, i.e., in which $L_3^* = L_3$ and $L_\pm^* = L_\mp$ (which implies that $L_1^* = L_1$ and $L_2^* = L_2$, which jointly imply $\rho(X)^* = -\rho(X)$ for any $X \in \mathfrak{g}$). This may be done in principle by starting with any inner product, integrating $\rho$ to a unitary representation of $SU(2)$, and using the construction explained at the beginning of the proof of Theorem 5.40. In practice, it is easier to just calculate: take $H = \mathbb{C}^n$ with $n = 2j + 1$, standard inner product, and standard orthonormal basis $(u_l)$, labeled by $l = 0, 1, \ldots, 2j$. Then put

$$L_3 u_l = (l - j) u_l; \tag{5.194}$$
$$L_+ u_l = \sqrt{(l + 1)(n - l - 1)}\, u_{l+1}; \tag{5.195}$$
$$L_- u_l = \sqrt{l(n - l)}\, u_{l-1}. \tag{5.196}$$

Note that (5.195) is even formally correct for $l = 2j$, since in that case $n - 2j - 1 = 0$, and similarly, (5.196) formally holds even for $l = 0$. The commutation relations (5.180) - (5.181) as well as the above conditions for skew-adjointness may be explicitly verified, from which it follows that for any prescribed dimension (5.193) we have found a skew-adjoint realization of $\rho$. Clearly, $u_l = \upsilon_{l-j}$.
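The explicit verification mentioned above can be delegated to a short program. The sketch below builds the matrices (5.194) - (5.196) for a given dimension $n = 2j + 1$ and checks (5.180), (5.181), and the value $c = j(j+1)$ of the Casimir operator obtained from (5.185); the function names are illustrative, and matrices are stored so that entry `M[r][c]` is the coefficient of $u_r$ in $M u_c$.

```python
import math

def spin_matrices(n):
    # Matrices of L_3, L_+, L_- on C^n in the basis u_0, ..., u_{n-1}.
    j = (n - 1) / 2
    L3 = [[(l - j) if r == l else 0.0 for l in range(n)] for r in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    Lm = [[0.0] * n for _ in range(n)]
    for l in range(n):
        if l + 1 < n:
            Lp[l + 1][l] = math.sqrt((l + 1) * (n - l - 1))   # (5.195)
        if l >= 1:
            Lm[l - 1][l] = math.sqrt(l * (n - l))             # (5.196)
    return L3, Lp, Lm

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(ca, a, cb, b):
    # Linear combination ca * a + cb * b of two square matrices.
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def comm(a, b):
    return lin(1, matmul(a, b), -1, matmul(b, a))

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def check(n):
    j = (n - 1) / 2
    L3, Lp, Lm = spin_matrices(n)
    ident = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    # C = L_+ L_- + L_3 (L_3 - 1), rearranged from (5.185).
    casimir = lin(1, matmul(Lp, Lm), 1, matmul(L3, lin(1, L3, -1, ident)))
    return (close(comm(L3, Lp), Lp)
            and close(comm(L3, Lm), lin(-1, Lm, 0, Lm))
            and close(comm(Lp, Lm), lin(2, L3, 0, L3))
            and close(casimir, lin(j * (j + 1), ident, 0, ident)))

results = {n: check(n) for n in range(1, 7)}
```

Since `Lm` is the transpose of the real matrix `Lp`, the condition $L_+^* = L_-$ holds by construction.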

In view of Theorem 5.40 and Corollary 5.43 we have therefore proved: 


Theorem 5.45. Up to unitary equivalence, any (unitary) irreducible representation of $SU(2)$ is completely determined by its dimension $n = \dim(H)$, and any dimension $n \in \mathbb{N}$ occurs. Furthermore, if $j$ is the number in (5.188), we have

$$n = 2j + 1. \tag{5.197}$$


Physicists typically label these irreducible representations by j (called the spin of 
the given representation) rather than by $n$, or even by $c = j(j+1)$, cf. (5.184).

Corollary 5.43 shows that one may pass from $\rho(\mathfrak{su}(2))$ to a unitary representation $u(SU(2))$, of which one may give a direct realization. For $j \in \mathbb{N}_0/2$, define $H_j$ as the complex vector space of all homogeneous polynomials $p$ in two variables $z = (z_1, z_2)$ of degree $2j$. A basis of $H_j$ is given by $(z_1^{2j}, z_1^{2j-1}z_2, \ldots, z_1 z_2^{2j-1}, z_2^{2j})$, which has $2j+1$ elements. So $\dim(H_j) = 2j+1$. Then consider the map


$$D_j : SU(2) \to B(H_j); \tag{5.198}$$
$$D_j(u)f(z) = f(zu). \tag{5.199}$$

Clearly,

$$D_j(e)f(z) = f(z \cdot 1_2) = f(z), \tag{5.200}$$

so $D_j(e) = 1$, and

$$D_j(u)D_j(v)f(z) = D_j(v)f(zu) = f(zuv) = D_j(uv)f(z),$$

so $D_j(u)D_j(v) = D_j(uv)$. Hence $D_j$ is a representation of $SU(2)$.


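We now compute $L_3 = iD_j'(S_3)$ on this space. From (5.156) with $u \rightsquigarrow D_j$, we have

$$L_3 = -\tfrac{1}{2}i\,D_j'\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = -\tfrac{1}{2}i\,\frac{d}{dt}D_j\begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix}_{|t=0}, \tag{5.201}$$

so that

$$L_3 f(z) = -\tfrac{1}{2}i\,\frac{d}{dt}f(e^{it}z_1, e^{-it}z_2)_{|t=0} = \tfrac{1}{2}\left(z_1\frac{\partial f(z)}{\partial z_1} - z_2\frac{\partial f(z)}{\partial z_2}\right). \tag{5.202}$$

Similarly, we obtain

$$L_+ f(z) = z_1\frac{\partial f(z)}{\partial z_2}; \tag{5.203}$$
$$L_- f(z) = z_2\frac{\partial f(z)}{\partial z_1}. \tag{5.204}$$

Hence $f_{2j}(z) = z_1^{2j}$ gives $L_3 f_{2j} = jf_{2j}$, and $f_0(z) = z_2^{2j}$ gives $L_3 f_0 = -jf_0$. In general, $f_l(z) = z_1^l z_2^{2j-l}$ spans the eigenspace $H_\lambda$ of $L_3$ with eigenvalue $\lambda = -j + l$. Since $l = 0,1,\ldots,2j$, this confirms (5.188), as well as the fact that the corresponding eigenspaces are all one-dimensional. The rest is easily checked, too, except for the unitarity of the representation, for which we refer to the proof of Theorem 5.40.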
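The action of these differential operators on monomials is easy to tabulate. As a minimal sketch (the dictionary encoding and all function names are illustrative assumptions), a homogeneous polynomial of degree $2j$ is stored as a map from $l$ to the coefficient of $z_1^l z_2^{2j-l}$, and $L_3 = \tfrac12(z_1\partial_1 - z_2\partial_2)$, $L_+ = z_1\partial_2$, $L_- = z_2\partial_1$ act on the coefficients:

```python
from fractions import Fraction

def L3(f, two_j):
    """L3 = (z1 d/dz1 - z2 d/dz2) / 2 on f = sum_l f[l] * z1**l * z2**(two_j - l)."""
    return {l: Fraction(l - (two_j - l), 2) * c for l, c in f.items()}

def L_plus(f, two_j):
    """L+ = z1 d/dz2 sends z1**l z2**m to m * z1**(l+1) z2**(m-1), with m = two_j - l."""
    return {l + 1: (two_j - l) * c for l, c in f.items() if l < two_j}

def L_minus(f, two_j):
    """L- = z2 d/dz1 sends z1**l z2**m to l * z1**(l-1) z2**(m+1)."""
    return {l - 1: l * c for l, c in f.items() if l > 0}

def monomial(l):
    """The basis polynomial f_l(z) = z1**l * z2**(two_j - l)."""
    return {l: Fraction(1)}

def eigenvalue_of_L3(l, two_j):
    """Eigenvalue of L3 on f_l; it should equal l - j."""
    (coeff,) = L3(monomial(l), two_j).values()
    return coeff
```

For instance, `eigenvalue_of_L3(l, 3)` runs through $-3/2, -1/2, 1/2, 3/2$ as $l = 0, \ldots, 3$, while `L_plus` annihilates $f_{2j}$ and `L_minus` annihilates $f_0$. Note that the monomials are not normalized, so the coefficients differ from the square roots in the orthonormal-basis formulas.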

Finally, we return to $SO(3)$. Either explicit exponentiation (5.165), as done for $j = 1/2$ in (5.168), or the above construction of $D_j$, allows one to verify the crucial condition stated in Corollary 5.43, namely that $D_j(\delta) = 1_H$ for $\delta \in D = \mathbb{Z}_2$, which comes down to $D_j(-1_2) = 1_H$. This is easily seen to be the case iff $j \in \mathbb{N}_0$.
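The last claim can be made explicit in the polynomial realization (5.199). The following one-line computation is a sketch that uses only the fact that each $f \in H_j$ is homogeneous of degree $2j$:

```latex
D_j(-1_2) f(z) = f(-z_1, -z_2) = (-1)^{2j} f(z_1, z_2)
\quad\Longrightarrow\quad
D_j(-1_2) = (-1)^{2j} \cdot 1_{H_j}.
```

This equals the identity iff $2j$ is even, i.e., iff $j$ is an integer.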


Corollary 5.46. Up to unitary equivalence, each unitary irreducible representation of $SO(3)$ is completely fixed by its dimension $n = 2j+1$, where $j \in \mathbb{N}_0$ (so that $n = 1$ for spin-0, $n = 3$ for spin-1, $n = 5$ for spin-2, ...), and each such dimension occurs.


5.9 Irreducible representations of compact Lie groups


Because of its importance for the classical-quantum correspondence (cf. §7.1) we first reformulate the main result of the previous section (i.e. the classification of the irreducible representations of $SU(2)$) and on that basis generalize this result to arbitrary compact Lie groups. This gives a classification of great simplicity and beauty.

We already encountered the coadjoint representation (3.100) of a Lie group $G$ on $\mathfrak{g}^*$, given by $(x\cdot\theta)(A) = \theta(x^{-1}Ax)$, where $x \in G$, $\theta \in \mathfrak{g}^*$, $A \in \mathfrak{g}$. The orbits under this action are called coadjoint orbits. If $G = SO(3)$, we have $\mathfrak{g} \cong \mathbb{R}^3$ under the map

$$x\cdot J = \sum_{k=1}^{3} x_k J_k \mapsto (x_1,x_2,x_3) = x, \tag{5.205}$$

where the matrices $J_k$ are given in (3.66). Hence also $\mathfrak{g}^* \cong \mathbb{R}^3$ under the map

$$\theta \mapsto \Big((\theta_1,\theta_2,\theta_3) : x \mapsto \sum_{k=1}^{3}\theta_k x_k\Big). \tag{5.206}$$

Writing $R \in SO(3)$ for a generic element $x \in G$, analogously to (5.44), we can compute the adjoint action $R : A \mapsto RAR^{-1}$, seen as an action on $\mathbb{R}^3$, through

$$R(x\cdot J)R^{-1} = (Rx)\cdot J. \tag{5.207}$$

Using the fact that the angular momentum matrices transform as vectors, i.e.,

$$RJ_iR^{-1} = \sum_j R_{ji}J_j, \tag{5.208}$$

we find that the adjoint action of $SO(3)$ on $\mathfrak{g}$, seen as $\mathbb{R}^3$, is its defining action. In general, if $\mathfrak{g} \cong \mathbb{R}^n$ and also $\mathfrak{g}^* \cong \mathbb{R}^n$ under the usual pairing of $\mathbb{R}^n$ and $\mathbb{R}^n$ through the Euclidean inner product, the coadjoint action of $G$ on $\mathfrak{g}^*$, seen as an action on $\mathbb{R}^n$, is given by the inverse transpose of the adjoint action on $\mathfrak{g} \cong \mathbb{R}^n$. For $SO(3)$ we have $(R^{-1})^T = R$, so the coadjoint action of $SO(3)$ on $\mathbb{R}^3$ is just its defining action, too, and hence the coadjoint orbits are the 2-spheres $S_r^2$ with radius $r \geq 0$.
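The identity $R(x\cdot J)R^{-1} = (Rx)\cdot J$ can be checked numerically. The sketch below assumes the convention $(J_k)_{ij} = -\varepsilon_{kij}$ for the matrices of (3.66), so that $(x\cdot J)y = x\times y$; this convention, like the helper names, is an assumption made for illustration only.

```python
import math

def hat(x):
    """Matrix of x . J, ASSUMING (J_k)_{ij} = -epsilon_{kij}, i.e. (x . J) y = x cross y."""
    x1, x2, x3 = x
    return [[0.0, -x3, x2], [x3, 0.0, -x1], [-x2, x1, 0.0]]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def apply(R, x):
    return [sum(R[r][k] * x[k] for k in range(3)) for r in range(3)]

def rot_z(a):
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]

def rot_x(a):
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(a), -math.sin(a)],
            [0.0, math.sin(a), math.cos(a)]]

def adjoint_is_defining_action(R, x, tol=1e-12):
    """Check R (x . J) R^{-1} = (R x) . J entrywise, using R^{-1} = R^T."""
    lhs = matmul(matmul(R, hat(x)), transpose(R))
    rhs = hat(apply(R, x))
    return all(abs(lhs[r][c] - rhs[r][c]) < tol for r in range(3) for c in range(3))
```

The check succeeds for products of rotations such as `matmul(rot_z(0.7), rot_x(-1.3))`. It fails for a reflection, which is orthogonal but has determinant $-1$, so the restriction to $SO(3)$ matters.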

Turning to $SU(2)$, we now make the identification of $\mathfrak{g}^*$ with $\mathbb{R}^3$ slightly differently, namely by replacing the $3 \times 3$ real matrices $J_k$ in (5.205) by the $2 \times 2$ matrices $S_k$ in (5.174), but the computation is similar: using (5.44) - (5.45), we find that the coadjoint action of $u \in SU(2)$ on $\mathbb{R}^3$ is given by the defining action of $\tilde\pi(u) \in SO(3)$, cf. (5.46). It follows that the coadjoint orbits for $SU(2)$ are the same as for $SO(3)$.

Returning to general Lie groups $G$ for the moment, assumed connected for simplicity, we take some coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$, fix a point $\theta \in \mathcal{O}$ (so that $\mathcal{O} = G\cdot\theta \cong G/G_\theta$), and look at the stabilizer $G_\theta$ and its Lie algebra $\mathfrak{g}_\theta$. Since the derivative $\mathrm{Ad}'$ of the adjoint action $\mathrm{Ad}$ of $G$ on $\mathfrak{g}$ (defined as in (5.156)) is given by

$$\mathrm{Ad}'(A) : B \mapsto [A,B], \tag{5.209}$$

it follows that the “infinitesimal stabilizer” $\mathfrak{g}_\theta$ is given by

$$\mathfrak{g}_\theta = \{A \in \mathfrak{g} \mid \theta([A,B]) = 0 \ \forall B \in \mathfrak{g}\}. \tag{5.210}$$

Consequently, the restriction of $\theta : \mathfrak{g} \to \mathbb{R}$ to $\mathfrak{g}_\theta \subset \mathfrak{g}$ is a Lie algebra homomorphism (where $\mathbb{R}$ is obviously endowed with the zero Lie bracket). Consider a character $\chi : G_\theta \to \mathbb{T}$, which is the same thing as a one-dimensional unitary representation of $G_\theta$. If we regard $\mathbb{T}$ as a closed subgroup of $GL_1(\mathbb{C})$, its Lie algebra $\mathfrak{t}$ is given by $i\mathbb{R} \subset M_1(\mathbb{C}) = \mathbb{C}$. It is conventional (at least among physicists) to take $-i$ as the basis element of $\mathfrak{t}$, so that $\mathfrak{t} \cong \mathbb{R}$ under $-it \leftrightarrow t$, and the exponential map $\exp : \mathfrak{t} \to \mathbb{T}$ (which is the usual one), seen as a map from $\mathbb{R}$ to $\mathbb{T}$, is given by $t \mapsto \exp(-it)$. Defining the derivative $\chi' : \mathfrak{g}_\theta \to \mathbb{C}$ as in (5.156), it follows that actually $\chi' : \mathfrak{g}_\theta \to i\mathbb{R}$, so that $i\chi'$ maps $\mathfrak{g}_\theta$ to $\mathbb{R}$ and is a Lie algebra homomorphism.


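Definition 5.47. Let $G$ be a connected Lie group. A coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$ is called integral if for some (and hence all) $\theta \in \mathcal{O}$ one has $\theta_{|\mathfrak{g}_\theta} = i\chi'$ for some character $\chi : G_\theta \to \mathbb{T}$, i.e. if there is a character $\chi$ such that for each $A \in \mathfrak{g}_\theta$ one has

$$\theta(A) = i\frac{d}{dt}\chi\big(e^{tA}\big)_{|t=0}. \tag{5.211}$$

In the simplest case where $G = \mathbb{T}$, the coadjoint action on $\mathfrak{t}^*$ is evidently trivial, so that $G_\theta = G = \mathbb{T}$ for any $\theta \in \mathfrak{t}^* \cong \mathbb{R}$. Furthermore, any character on $\mathbb{T}$ takes the form $\chi_n(z) = z^n$, where $n \in \mathbb{Z}$, cf. (C.351). As explained above, if $\mathfrak{t} \cong \mathbb{R}$ and hence also $\mathfrak{t}^* \cong \mathbb{R}$, the identification of $\lambda \in \mathfrak{t}^*$ with $\lambda \in \mathbb{R}$ is made by $\lambda(-i) \leftrightarrow \lambda$, where $-i \in \mathfrak{t}$. If $\chi = \chi_n$, the right-hand side of (5.211) evaluated at $A = -i$ equals $n$, so that (5.211) holds iff $\theta = n$ for some $n \in \mathbb{Z}$. Thus the integral coadjoint orbits in $\mathfrak{t}^*$ are the integers $\mathbb{Z} \subset \mathbb{R}$. Similarly, if $G = \mathbb{T}^d$, the characters are elements of $\mathbb{Z}^d$, as in

$$\chi_{(n_1,\ldots,n_d)}(z_1,\ldots,z_d) = z_1^{n_1}\cdots z_d^{n_d}, \tag{5.212}$$

and the integral coadjoint orbits in $\mathfrak{g}^* \cong \mathbb{R}^d$ are the points of the lattice $\mathbb{Z}^d \subset \mathbb{R}^d$.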

For $G = SU(2)$ we take a coadjoint orbit $S_r^2 \subset \mathbb{R}^3$ and fix $\theta_r = (0,0,r)$. If $r = 0$, then $G_\theta = G$ and (5.211) holds for the trivial character $\chi \equiv 1$, so the orbit $\{(0,0,0)\}$ is integral. Let $r > 0$. Then $G_{\theta_r}$ consists of the pre-image of $SO(2)$ in $SU(2)$ under the projection $\tilde\pi$ in (5.46), where $SO(2) \subset SO(3)$ is the group of rotations around the $z$-axis. This is the abelian group

$$T = \{\mathrm{diag}(z,\bar{z}) \mid z \in \mathbb{T}\}. \tag{5.213}$$

This group is isomorphic to $\mathbb{T}$ under $\mathrm{diag}(z,\bar{z}) \mapsto z$ and hence its characters are given by $\chi_n(\mathrm{diag}(z,\bar{z})) = z^n$, where $n \in \mathbb{Z}$. The identification $\mathfrak{g}^* \cong \mathbb{R}^3$ is made by identifying $\theta \in \mathfrak{g}^*$ with $(\theta_1,\theta_2,\theta_3)$, where $\theta_i = \theta(S_i)$. Putting $A = S_3$ in (5.211), see (5.174), therefore gives $r = n/2$ for some $n \in \mathbb{N}$. We conclude that the integral coadjoint orbits for $SU(2)$ are given by the two-spheres $S_r^2 \subset \mathbb{R}^3$ with $r \in \mathbb{N}_0/2$.
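The value $r = n/2$ follows from a short computation, sketched here under the assumption that (5.174) reads $S_3 = -\tfrac12 i\sigma_3$ (this normalization is not restated in the present section):

```latex
e^{tS_3} = \mathrm{diag}\big(e^{-it/2}, e^{it/2}\big), \qquad
\chi_n\big(e^{tS_3}\big) = e^{-int/2}, \qquad
i\,\frac{d}{dt}\,\chi_n\big(e^{tS_3}\big)_{|t=0} = \frac{n}{2}.
```

Since $\theta_r(S_3) = r$ for $\theta_r = (0,0,r)$, condition (5.211) at $A = S_3$ reads $r = n/2$.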

Similarly, for $G = SO(3)$ the stabilizer of $(0,0,r)$ is $SO(2) \cong \mathbb{T}$ itself, and putting $A = J_3$ in (5.211) one finds that the integral coadjoint orbits are the spheres $S_r^2$ with $r \in \mathbb{N}_0$.

For any (Lie) group $G$, let the unitary dual $\hat{G}$ be the set whose elements are equivalence classes of unitary irreducible representations of $G$, where we say:


Definition 5.48. Two unitary representations $u_i : G \to U(H_i)$, $i = 1,2$, are equivalent if there is a unitary $v : H_1 \to H_2$ such that $u_2(x) = vu_1(x)v^*$ for each $x \in G$.


The examples $G = \mathbb{T}^d$ and $G = SU(2)$ now suggest the following theorem:


Theorem 5.49. If $G$ is a compact connected Lie group, then the unitary dual $\hat{G}$ is parametrized by the set of integral coadjoint orbits in $\mathfrak{g}^*$.


Furthermore, there is an explicit (geometric) procedure to construct an irreducible representation $u_{\mathcal{O}}$ corresponding to such an orbit, namely the method of geometric quantization. We will not explain this method, which would require some reasonably advanced differential geometry, but instead we outline the connection between coadjoint orbits and the well-known method of the highest weight.

Let $G$ be a compact connected Lie group and pick a maximal torus $T \subset G$. Let

$$W_T = N(T)/T \tag{5.214}$$

be the corresponding Weyl group, where $N(T)$ is the normalizer of $T$ in $G$ (i.e., $x \in N(T)$ iff $xzx^{-1} \in T$ for each $z \in T$). Note that all maximal tori in compact connected Lie groups are conjugate, so that the specific choice of $T$ is irrelevant.

For example, for $SU(2)$ we take (5.213), in which case $N(T)$ is generated by $T$ and $i\sigma_1 \in SU(2)$, so that $W_T \cong \mathfrak{S}_2$, i.e., the permutation group on two variables. In general the Weyl group inherits the adjoint action of $N(T)$ on $T$, so that $W_T$ acts on $T$ and hence also acts on $\mathfrak{t}$ and $\mathfrak{t}^*$; for $SU(2)$ the action of the nontrivial element of $W_T$ (i.e., the image $[i\sigma_1]$ of $i\sigma_1 \in N(T)$ in $N(T)/T$) on $T$ is given by

$$[i\sigma_1](\mathrm{diag}(z,\bar{z})) = \mathrm{diag}(\bar{z},z), \tag{5.215}$$

so that its action on $T \cong \mathbb{T}$ is $z \mapsto \bar{z}$, which gives rise to actions $A \mapsto -A$ of $W_T$ on $\mathfrak{t}$ and hence $\lambda \mapsto -\lambda$ of $W_T$ on $\mathfrak{t}^*$. This is a special case of the following bijection:

$$\mathfrak{g}^*/G \cong \mathfrak{t}^*/W_T, \tag{5.216}$$

where the $G$-action on $\mathfrak{g}^*$ is the coadjoint one; globally, one has $G/\mathrm{Ad}(G) \cong T/W_T$. Indeed, for $SU(2)$ the left-hand side of (5.216) is the set of spheres $S_r^2$ in $\mathbb{R}^3$, $r \geq 0$, whereas the right-hand side is $\mathbb{R}/\mathfrak{S}_2$ (where $\mathfrak{S}_2$ acts on $\mathbb{R}$ by $\theta \mapsto -\theta$). In general, a given coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$ defines a Weyl group orbit $\mathcal{O}_W$ in $\mathfrak{t}^*$ as follows: $\mathcal{O}$ contains a point $\theta$ for which $T \subseteq G_\theta$, and we take $\mathcal{O}_W$ to be the orbit through $\theta_{|\mathfrak{t}}$. Conversely, any $G$-invariant inner product on $\mathfrak{g}$ induces a decomposition

$$\mathfrak{g} = \mathfrak{t} \oplus \mathfrak{t}^\perp, \tag{5.217}$$

which yields an extension of $\lambda \in \mathfrak{t}^*$ to $\theta_\lambda \in \mathfrak{g}^*$ that vanishes on $\mathfrak{t}^\perp$. Let $\Lambda \subset \mathfrak{t}^*$ be the set of integral elements in $\mathfrak{t}^*$ (as explained after Definition 5.47). Elements of $\Lambda$ are called weights. Theorem 5.51 below gives a parametrization

$$\hat{G} \cong \Lambda/W_T, \tag{5.218}$$

which, restricting (5.216) to the integral part $\Lambda \subset \mathfrak{t}^*$, implies Theorem 5.49.

Instead of the quotient $\Lambda/W_T$, one may prefer to work with $\Lambda$ itself, as follows: we say that $\lambda \in \mathfrak{t}^*$ is regular if $w\cdot\lambda = \lambda$ for $w \in W_T$ iff $w = e$; this is the case iff $\lambda = \theta_{|\mathfrak{t}}$ with $G_\theta = T$. For $SU(2)$ all weights $\lambda \in \mathbb{Z}$ are regular except $\lambda = 0$. The set $\mathfrak{t}^*_r$ of regular elements of $\mathfrak{t}^*$ falls apart into connected components $C$, called Weyl chambers, which are mapped into each other by $W_T$. For $SU(2)$ one has $\mathfrak{t}^*_r = (-\infty,0) \cup (0,\infty)$, so that the Weyl chambers are $(-\infty,0)$ and $(0,\infty)$.

One picks an arbitrary Weyl chamber $C_d$ (for $SU(2)$ this is $(0,\infty)$) and forms

$$\Lambda_d = \Lambda \cap \overline{C_d}, \tag{5.219}$$

where $\overline{C_d}$ is the closure of $C_d$ in $\mathfrak{t}^*$. Elements of $\Lambda_d$ are called dominant weights. For each element of $\Lambda/W_T$ there is a unique dominant weight representing it in $\Lambda$, so that instead of (5.218) we may also write what Theorem 5.51 actually gives, viz.

$$\hat{G} \cong \Lambda_d. \tag{5.220}$$

To explain this in some detail, we need further preparation. Any (unitary) representation $u : G \to U(H)$ on some finite-dimensional Hilbert space $H$ restricts to $T$, and since $T$ is abelian, we may simultaneously diagonalize all operators $u(z)$, $z \in T$. The operators $iu'(A)$, where $A \in \mathfrak{t}$, commute as well, so that we may decompose

$$H = \bigoplus_{\mu \in \Lambda_u} H_\mu, \tag{5.221}$$

where $\Lambda_u \subset \Lambda$ contains the weights that occur in $u$, so that for each $\psi \in H_\mu$,

$$u(z)\psi = \chi_\mu(z)\psi \quad (z \in T); \tag{5.222}$$
$$iu'(Z)\psi = \mu(Z)\psi \quad (Z \in \mathfrak{t}), \tag{5.223}$$

where the character $\chi_\mu : T \to \mathbb{T}$ corresponding to the weight $\mu \in \Lambda$ is defined as in (5.212) with $\mu = (n_1,\ldots,n_d)$ and $z = (z_1,\ldots,z_d) \in T \cong \mathbb{T}^d$, where $d = \dim(T)$. For example, we have seen that the irreducible representation $D_j(SU(2))$ on $H_j \cong \mathbb{C}^{2j+1}$ contains the weights in $\Lambda_j = \{-j,-j+1,\ldots,j-1,j\}$, where $j \in \mathbb{N}_0/2$.

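In particular, take $H = \mathfrak{g}_{\mathbb{C}}$ with some $G$-invariant inner product, cf. (5.148), and take $u = \mathrm{Ad}$, given by $\mathrm{Ad}(x)B = xBx^{-1}$, so that $\mathrm{Ad}'(A)(B) = [A,B]$, extended from $\mathfrak{g}$ to $\mathfrak{g}_{\mathbb{C}}$: we write $\mathfrak{g}_{\mathbb{C}} = \mathfrak{g} + i\mathfrak{g}$ and hence put $\mathrm{Ad}'(A)(B+iC) = [A,B] + i[A,C]$, where $A,B,C \in \mathfrak{g}$. We assume that the inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{g}_{\mathbb{C}}$ is obtained from a real inner product on $\mathfrak{g}$ by complexification. This inner product on $\mathfrak{g}$ may be restricted to $\mathfrak{t} \subset \mathfrak{g}$ and hence induces an inner product on $\mathfrak{t}^*$, also denoted by $\langle\cdot,\cdot\rangle$. For example, if $G$ is semi-simple (like $SU(2)$), one may take the inner product on $\mathfrak{g}$ and hence on $\mathfrak{g}_{\mathbb{C}}$ to be the Cartan-Killing form $\langle A,B\rangle = -\tfrac12\mathrm{Tr}(\mathrm{Ad}'(A)\mathrm{Ad}'(B))$, which is nondegenerate because $G$ is semi-simple, and positive definite since $G$ is compact. For $SU(2)$ or $SO(3)$ this gives the usual inner product on $\mathbb{R}^3$ and $\mathbb{C}^3$.


Definition 5.50. The roots of $\mathfrak{g}$ are the nonzero weights of the adjoint representation $u = \mathrm{Ad}$ on $H = \mathfrak{g}_{\mathbb{C}}$. That is, writing $\Delta \subset \Lambda$ for the set of roots, we have $\alpha \in \Delta$ iff $\alpha : \mathfrak{t} \to \mathbb{R}$ is not identically zero and there is some $E_\alpha \in \mathfrak{g}_{\mathbb{C}}$ such that for each $Z \in \mathfrak{t}$,

$$i[Z,E_\alpha] = \alpha(Z)E_\alpha, \tag{5.224}$$

cf. (5.223). Furthermore, subject to the choice of a preferred Weyl chamber $C_d$ in $\mathfrak{t}^*_r$, we say $\alpha \in \Delta$ is positive, denoted by $\alpha \in \Delta^+$, if $\langle\alpha,\lambda\rangle > 0$ for each $\lambda \in C_d$.


Since $\langle\alpha,\lambda\rangle$ is real and nonzero for each $\alpha \in \Delta$ and $\lambda \in C_d$, one has either $\alpha \in \Delta^+$ or $-\alpha \in \Delta^+$, i.e., $\alpha \in \Delta^- = -\Delta^+$. Since $\mathfrak{t}$ is maximal abelian in $\mathfrak{g}$, it can also be shown that each root is nondegenerate. Writing $\mathfrak{g}_\alpha = \mathbb{C}\cdot E_\alpha$, this gives a decomposition

$$\mathfrak{g}_{\mathbb{C}} = \mathfrak{t}_{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta^+}\mathfrak{g}_\alpha \oplus \bigoplus_{\alpha \in \Delta^-}\mathfrak{g}_\alpha. \tag{5.225}$$

For $G = SU(2)$, the single generator of $\mathfrak{t}$ is $S_3$, and taking $E_\pm = i(S_1 \pm iS_2)$, we see from (5.180) that $i[S_3,E_\pm] = \pm E_\pm$. Hence the roots are $\alpha_\pm$, given by $\alpha_\pm(S_3) = \pm 1$, and with $(0,\infty)$ as the Weyl chamber of choice, the root $\alpha_+$ is the positive one.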
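The root computation for $SU(2)$ can be reproduced with explicit $2 \times 2$ matrices. The following sketch assumes that the basis of (5.174) is $S_k = -\tfrac12 i\sigma_k$ with $\sigma_k$ the Pauli matrices (an assumption, since (5.174) is not restated here); the helper names are illustrative.

```python
# Pauli matrices; S_k = -(i/2) sigma_k is ASSUMED to be the basis of (5.174).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
S1, S2, S3 = [[[-0.5j * a for a in row] for row in sigma] for sigma in SIGMA]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def commutator(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

E_plus = scale(1j, add(S1, scale(1j, S2)))    # E_+ = i(S_1 + i S_2)
E_minus = scale(1j, add(S1, scale(-1j, S2)))  # E_- = i(S_1 - i S_2)

def root_value(E, tol=1e-12):
    """Return alpha(S_3) with i[S_3, E] = alpha(S_3) E, or None if E is no eigenvector."""
    lhs = scale(1j, commutator(S3, E))
    pairs = [(l, e) for row_l, row_e in zip(lhs, E) for l, e in zip(row_l, row_e)]
    ratios = [l / e for l, e in pairs if abs(e) > tol]
    if not ratios:
        return None
    alpha = ratios[0]
    if all(abs(l - alpha * e) < tol for l, e in pairs):
        return alpha.real
    return None
```

With these conventions `root_value(E_plus)` and `root_value(E_minus)` give the two roots $\pm 1$, whereas a generic element such as `S1` is not a root vector and yields `None`.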

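We now define a partial ordering $\leq$ on $\Lambda$ by putting $\mu \leq \lambda$ iff $\lambda - \mu = \sum_i n_i\alpha_i$ for some $n_i \in \mathbb{N}_0$ and $\alpha_i \in \Delta^+$. This brings us to the theorem of the highest weight:


Theorem 5.51. Let $G$ be a connected compact Lie group. There is a parametrization $\hat{G} \cong \Lambda_d$, such that any unitary irreducible representation $u_\lambda : G \to U(H_\lambda)$ in the class $\lambda \in \hat{G}$ defined by a given dominant weight $\lambda \in \Lambda_d$ has the following properties:


1. $H_\lambda$ contains a unit vector $\upsilon_\lambda$, unique up to a phase, such that

$$iu'_\lambda(Z)\upsilon_\lambda = \lambda(Z)\upsilon_\lambda \quad (Z \in \mathfrak{t}); \tag{5.226}$$
$$iu'_\lambda(E_\alpha)\upsilon_\lambda = 0 \quad (\alpha \in \Delta^+). \tag{5.227}$$

2. Any other weight $\mu$ occurring in $H_\lambda$, cf. (5.221), satisfies $\mu \leq \lambda$ and $\mu \neq \lambda$.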
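
The crucial point is that eqs. (5.226) - (5.227) imply

$$\theta_\lambda(A) = i\langle\upsilon_\lambda, u'_\lambda(A)\upsilon_\lambda\rangle \quad (A \in \mathfrak{g}), \tag{5.228}$$

where $\theta_\lambda \in \mathfrak{g}^*$ was defined after (5.217) by $\lambda \in \Lambda_d \subset \mathfrak{t}^*$. Since each operator $u_\lambda(x)$ is unitary, each vector $u_\lambda(x)\upsilon_\lambda$ is a unit vector, so we may form the $G$-orbit

$$\mathcal{O}'_\lambda = \{|u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|,\ x \in G\} \tag{5.229}$$

through $|\upsilon_\lambda\rangle\langle\upsilon_\lambda|$ in the space $\mathcal{P}_1(H_\lambda)$ of all one-dimensional projections on $H_\lambda$. Denoting the coadjoint orbit $G\cdot\theta_\lambda \subset \mathfrak{g}^*$ by $\mathcal{O}_\lambda$, where $\lambda = (\theta_\lambda)_{|\mathfrak{t}}$, the map

$$x\cdot\theta_\lambda \mapsto |u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda| \tag{5.230}$$

is a $G$-equivariant diffeomorphism (in fact, a symplectomorphism) from $\mathcal{O}_\lambda$ to $\mathcal{O}'_\lambda$. This amplifies Theorem 5.49 by making the bijective correspondence between the set $\Lambda_d$ of dominant weights and the set of integral coadjoint orbits explicit.


5.10 Symmetry groups and projective representations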


Despite the power and beauty of unitary group representations in mathematics, in the context of e.g. Wigner’s Theorem we have seen that in physics one should look at homomorphisms $x \mapsto W(x)$, where $W(x)$ is a symmetry of $\mathcal{P}_1(H)$. In view of Theorem 5.4, this is equivalent to considering a single homomorphism $h : G \to \mathcal{G}^H$, cf. (5.136). To simplify the discussion, we now drop $U_a(H)$ from consideration and just deal with the connected component $\mathcal{G}^H_0 = U(H)/\mathbb{T}$ of the identity. This restriction may be justified by noting that in what follows we will only deal with symmetries given by connected Lie groups, which have the property that each element is a product of squares $x = y^2$. In that case, $h(x) = h(y)^2$ is always a square and hence it cannot lie in the component $U_a(H)/\mathbb{T}$ (the anti-unitary case does play a role as soon as discrete symmetries are studied, such as time inversion, parity, or charge conjugation). Thus in what follows we will study continuous homomorphisms

$$h : G \to U(H)/\mathbb{T}, \tag{5.231}$$

where $U(H)/\mathbb{T}$ has the quotient topology inherited from the strong operator topology on $U(H)$, as explained above. Since it is inconvenient to deal with such a quotient, we try to lift $h$ to some map (5.137) where, in terms of the canonical projection

$$\pi : U(H) \to U(H)/\mathbb{T}, \tag{5.232}$$

which is evidently a group homomorphism, we have

$$\pi \circ u = h. \tag{5.233}$$

This can be done by choosing a cross-section $s$ of $\pi$, that is, a measurable map

$$s : U(H)/\mathbb{T} \to U(H), \tag{5.234}$$

or (this doesn’t matter much) a map $s : h(G) \to U(H)$, such that

$$\pi \circ s = \mathrm{id}. \tag{5.235}$$

Given $h$, such a cross-section $s$ yields a map $u : G \to U(H)$ through

$$u = s \circ h; \tag{5.236}$$


in particular, $\pi(u(x)) = h(x)$. Such a lift often loses the homomorphism property, though in a controlled way, as follows. Since different choices of $s$ must differ by a phase, and $h$ is a homomorphism of groups, there must be a function

$$c : G \times G \to \mathbb{T} \tag{5.237}$$

such that

$$u(x)u(y) = c(x,y)u(xy) \quad (x,y \in G). \tag{5.238}$$

Indeed, since $\pi$ and $h$ are homomorphisms, we may compute

$$\pi(u(x)u(y)u(xy)^{-1}) = \pi(s(h(x)))\,\pi(s(h(y)))\,\pi(s(h(xy)))^{-1} = h(x)h(y)h(xy)^{-1} = h(xy)h(xy)^{-1} = h(e) = e_{U(H)/\mathbb{T}}.$$

Hence $u(x)u(y)u(xy)^{-1} \in \pi^{-1}(e_{U(H)/\mathbb{T}}) = \mathbb{T}\cdot 1_H$, which yields (5.238), or, more directly,

$$c(x,y)\cdot 1_H = u(x)u(y)u(xy)^{-1}. \tag{5.239}$$

Associativity of multiplication in $G$ and the homomorphism property of $h$ yield

$$c(x,y)c(xy,z) = c(x,yz)c(y,z), \tag{5.240}$$

and if we impose the natural requirement $u(e) = 1_H$, we also have

$$c(e,x) = c(x,e) = 1. \tag{5.241}$$


Definition 5.52. A function $c : G \times G \to \mathbb{T}$ satisfying (5.240) and (5.241) is called a multiplier or 2-cocycle on $G$ (in the topological case one requires $c$ to be Borel measurable, and for Lie groups it should in addition be smooth near the identity). The set of such multipliers, seen as an abelian group under (pointwise) operations in $\mathbb{T}$, is denoted by $Z^2(G,\mathbb{T})$. If $c$ takes the form

$$c(x,y) = \frac{b(xy)}{b(x)b(y)}, \tag{5.242}$$

where $b : G \to \mathbb{T}$ satisfies $b(e) = 1$ (and is measurable and smooth near $e$ as appropriate), then $c$ is called a 2-coboundary or an exact multiplier. The set of exact multipliers forms a (normal) subgroup $B^2(G,\mathbb{T})$ of $Z^2(G,\mathbb{T})$, and the quotient

$$H^2(G,\mathbb{T}) = \frac{Z^2(G,\mathbb{T})}{B^2(G,\mathbb{T})} \tag{5.243}$$

is called the second cohomology group of $G$ with coefficients in $\mathbb{T}$.


The reason 2-coboundaries and the ensuing group $H^2(G,\mathbb{T})$ are interesting for our problem is as follows. Given a map $x \mapsto u(x)$ from $G$ to $U(H)$ with (5.238), suppose we change $u(x)$ to $u'(x) = b(x)u(x)$. The associated multiplier then changes to

$$c'(x,y) = \frac{b(x)b(y)}{b(xy)}\,c(x,y), \tag{5.244}$$

in that $u'(x)u'(y) = c'(x,y)u'(xy)$. In particular, a multiplier of the form (5.242) may be removed by such a transformation, and is accordingly called exact.
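These statements are easy to test on a finite group. The sketch below takes the cyclic group $\mathbb{Z}_5$ (written additively) as a toy example, builds an exact multiplier from arbitrarily chosen phases $b(x)$ with $b(e) = 1$, checks (5.240) and (5.241), and removes the multiplier via (5.244). The group, the phases, and the function names are illustrative assumptions.

```python
import cmath

def coboundary(b, n):
    """Return c(x, y) = b(x + y) / (b(x) b(y)) on the cyclic group Z_n, as in (5.242)."""
    return {(x, y): b[(x + y) % n] / (b[x] * b[y])
            for x in range(n) for y in range(n)}

def is_multiplier(c, n, tol=1e-12):
    """Check the cocycle identity (5.240) and the normalization (5.241)."""
    G = range(n)
    cocycle = all(
        abs(c[x, y] * c[(x + y) % n, z] - c[x, (y + z) % n] * c[y, z]) < tol
        for x in G for y in G for z in G)
    normalized = all(abs(c[0, x] - 1) < tol and abs(c[x, 0] - 1) < tol for x in G)
    return cocycle and normalized

def rephase(c, b, n):
    """Apply (5.244): c'(x, y) = b(x) b(y) / b(x + y) * c(x, y)."""
    return {(x, y): b[x] * b[y] / b[(x + y) % n] * c[x, y] for (x, y) in c}

n = 5
# Arbitrary phases with b(e) = 1; the specific angles are illustrative only.
b = [cmath.exp(1j * angle) for angle in (0.0, 0.4, 1.1, -2.0, 0.9)]
c = coboundary(b, n)
c_prime = rephase(c, b, n)
```

Here `c` is a genuine multiplier that is not identically 1, while every value of `c_prime` equals 1 up to rounding, which is the sense in which an exact multiplier is removable.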


Proposition 5.53. If $H^2(G,\mathbb{T})$ is trivial, then any multiplier can be removed by modifying the lift $u$ of $h$, and the ensuing map $u' : G \to U(H)$ is a homomorphism and hence a unitary representation of $G$ on $H$. In that case, any homomorphism $G \to U(H)/\mathbb{T}$ comes from a unitary representation $u : G \to U(H)$ through (5.233).

This is true by construction. By the same token, if $H^2(G,\mathbb{T})$ is non-trivial, then $G$ will have projective representations that cannot be turned into ordinary ones by a change of phase (for it can be shown that any multiplier $c \in Z^2(G,\mathbb{T})$ is realized by some projective representation). Thus it is important to compute $H^2(G,\mathbb{T})$ for any given (physically relevant) group $G$, and see what can be done if it is non-trivial.

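To this end we present the main results of practical use. In order to state one of the main results (Whitehead’s Lemma), we need to set up a cohomology theory for $\mathfrak{g}$ (which we only need with trivial coefficients). Let $C^k(\mathfrak{g},\mathbb{R})$ be the abelian group of all $k$-linear totally antisymmetric maps $\varphi : \mathfrak{g}^k \to \mathbb{R}$, with coboundary maps

$$\delta^{(k)} : C^k(\mathfrak{g},\mathbb{R}) \to C^{k+1}(\mathfrak{g},\mathbb{R}); \tag{5.245}$$
$$\delta^{(k)}\varphi(X_0,X_1,\ldots,X_k) = \sum_{0 \leq i < j \leq k}(-1)^{i+j}\varphi([X_i,X_j],X_0,\ldots,\hat{X}_i,\ldots,\hat{X}_j,\ldots,X_k), \tag{5.246}$$

where the hat means that the corresponding entry is omitted. For example, we have

$$\delta^{(1)}\varphi(X_0,X_1) = -\varphi([X_0,X_1]);$$
$$\delta^{(2)}\varphi(X_0,X_1,X_2) = -\varphi([X_0,X_1],X_2) + \varphi([X_0,X_2],X_1) - \varphi([X_1,X_2],X_0).$$

These maps satisfy “$\delta^2 = 0$”, or, more precisely,

$$\delta^{(k+1)} \circ \delta^{(k)} = 0, \tag{5.247}$$

and hence we may define the following abelian groups:

$$B^k(\mathfrak{g},\mathbb{R}) = \mathrm{ran}(\delta^{(k-1)}); \tag{5.248}$$
$$Z^k(\mathfrak{g},\mathbb{R}) = \ker(\delta^{(k)}); \tag{5.249}$$
$$H^k(\mathfrak{g},\mathbb{R}) = \frac{Z^k(\mathfrak{g},\mathbb{R})}{B^k(\mathfrak{g},\mathbb{R})}. \tag{5.250}$$

Note that $B^k(\mathfrak{g},\mathbb{R}) \subseteq Z^k(\mathfrak{g},\mathbb{R})$ because of (5.247). In particular, for $k = 2$ the group $Z^2(\mathfrak{g},\mathbb{R})$ of all 2-cocycles on $\mathfrak{g}$ consists of all bilinear maps $\varphi : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ that satisfy

$$\varphi(X,Y) = -\varphi(Y,X); \tag{5.251}$$
$$\varphi(X,[Y,Z]) + \varphi(Z,[X,Y]) + \varphi(Y,[Z,X]) = 0, \tag{5.252}$$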


and its subgroup $B^2(\mathfrak{g},\mathbb{R})$ of all 2-coboundaries comprises all $\varphi$ taking the form

$$\varphi(X,Y) = \theta([X,Y]), \quad \theta \in \mathfrak{g}^*. \tag{5.253}$$

For example, for $\mathfrak{g} = \mathbb{R}$ any antisymmetric bilinear map $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is zero, so that

$$H^2(\mathbb{R},\mathbb{R}) = 0. \tag{5.254}$$

This has nothing to do with the fact that the Lie bracket on $\mathfrak{g}$ vanishes. Indeed, $\mathfrak{g} = \mathbb{R}^2$ does admit a nontrivial 2-cocycle, unique up to a scalar multiple, given by (half) the symplectic form, i.e.,

$$\varphi((p,q),(p',q')) = \tfrac12(pq' - qp'). \tag{5.255}$$

Since $B^2(\mathbb{R}^2,\mathbb{R}) = 0$, this cannot be removed, hence (5.255) generates $H^2(\mathbb{R}^2,\mathbb{R})$:

$$H^2(\mathbb{R}^2,\mathbb{R}) = \mathbb{R}. \tag{5.256}$$


As far as cohomology is concerned, each Lie group and each Lie algebra has its 
own story, although in some cases a group of stories may be collected into a single 
narrative. As a case in point, a Lie algebra g is called simple when it has no proper 
ideals, and semi-simple when it has no commutative ideals. A Lie algebra is semi- 
simple iff it is a direct sum of simple Lie algebras. If a Lie group G is (semi-) simple, 
then so is its Lie algebra g. A basic result, often called Whitehead’s Lemma, is: 


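Lemma 5.54. If $\mathfrak{g}$ is semi-simple, then $H^2(\mathfrak{g},\mathbb{R}) = 0$.

Proof. The key point is that $C^k(\mathfrak{g},\mathbb{R})$ is a $\mathfrak{g}$-module under the action

$$(X_0\cdot\varphi)(X_1,\ldots,X_k) = -\sum_{i=1}^{k}\varphi(X_1,\ldots,[X_0,X_i],\ldots,X_k). \tag{5.257}$$

For $k = 2$, a simple computation shows that

$$(X_0\cdot\varphi)(X_1,X_2) = -\varphi([X_0,X_1],X_2) - \varphi(X_1,[X_0,X_2]) = \delta^{(2)}\varphi(X_0,X_1,X_2) + \delta^{(1)}\varphi(X_0,-)(X_1,X_2), \tag{5.258}$$

where at fixed $X_0$, the map $\varphi(X_0,-)$ is seen as an element of $C^1(\mathfrak{g},\mathbb{R})$. This shows that $\mathfrak{g}$ maps both $B^2(\mathfrak{g},\mathbb{R})$ and $Z^2(\mathfrak{g},\mathbb{R})$ into themselves. Indeed, if $\varphi = \delta^{(1)}\chi$, then the first term in (5.258) vanishes because $\delta^{(2)} \circ \delta^{(1)} = 0$, cf. (5.247), so that the right-hand side of (5.258) takes the form $\delta^{(1)}(\cdots)$ and hence lies in $B^2(\mathfrak{g},\mathbb{R})$. Similarly, if $\delta^{(2)}\varphi = 0$, then $\delta^{(2)}(X_0\cdot\varphi) = 0$. We now use the fact that if $\mathfrak{g}$ is semi-simple, then any finite-dimensional module is completely reducible. Consequently, as a $\mathfrak{g}$-module, $Z^2(\mathfrak{g},\mathbb{R})$ must decompose as $Z^2(\mathfrak{g},\mathbb{R}) = B^2(\mathfrak{g},\mathbb{R}) \oplus V$, where $V$ is some $\mathfrak{g}$-module. Hence if $\varphi \in V$, then $X_0\cdot\varphi \in V$. Since $\varphi \in Z^2(\mathfrak{g},\mathbb{R})$, the first term in (5.258) vanishes, whilst the second term lies in $B^2(\mathfrak{g},\mathbb{R})$. Since $V \cap B^2(\mathfrak{g},\mathbb{R}) = \{0\}$, we therefore have $X_0\cdot\varphi = 0$, and hence $\delta^{(1)}\varphi(X_0,-)(X_1,X_2) = 0$, which gives $\varphi(X_0,[X_1,X_2]) = 0$ for all $X_0,X_1,X_2 \in \mathfrak{g}$. At this point we use another implication of the semi-simplicity of $\mathfrak{g}$, namely $[\mathfrak{g},\mathfrak{g}] = \mathfrak{g}$. It follows that $\varphi = 0$, whence $V = \{0\}$, from which $Z^2(\mathfrak{g},\mathbb{R}) = B^2(\mathfrak{g},\mathbb{R})$, or, in other words, $H^2(\mathfrak{g},\mathbb{R}) = 0$.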
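For the three-dimensional simple Lie algebra underlying $SO(3)$ and $SU(2)$ the lemma can be seen by hand. Under the identification of this Lie algebra with $(\mathbb{R}^3, \times)$ (an assumption about conventions, with the bracket given by the cross product), every antisymmetric bilinear form is $\varphi(X,Y) = a\cdot(X \times Y) = \theta([X,Y])$ with $\theta(Z) = a\cdot Z$, hence a coboundary as in (5.253). The sketch below checks this for one arbitrarily chosen form; all names are illustrative.

```python
def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def evaluate(Phi, x, y):
    """phi(X, Y) = X^T Phi Y for an antisymmetric 3x3 matrix Phi."""
    return sum(x[i] * Phi[i][j] * y[j] for i in range(3) for j in range(3))

def theta_of(Phi):
    """Vector a with phi(X, Y) = a . (X cross Y), i.e. phi = theta([., .])."""
    return (Phi[1][2], Phi[2][0], Phi[0][1])

def cocycle_defect(Phi, x, y, z):
    """Left-hand side of (5.252) with [X, Y] = X cross Y."""
    return (evaluate(Phi, x, cross(y, z))
            + evaluate(Phi, z, cross(x, y))
            + evaluate(Phi, y, cross(z, x)))

# An arbitrary antisymmetric form, chosen only for illustration.
Phi = [[0, 2, -5], [-2, 0, 3], [5, -3, 0]]
a = theta_of(Phi)
```

With integer inputs the arithmetic is exact: `evaluate(Phi, x, y)` agrees with `dot(a, cross(x, y))` and `cocycle_defect` vanishes, the latter being the Jacobi identity in disguise.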


Theorem 5.55. Let $G$ be a connected and simply connected Lie group. Then

$$H^2(G,\mathbb{T}) \cong H^2(\mathfrak{g},\mathbb{R}). \tag{5.259}$$

Proof. This is really a conjunction of two isomorphisms:

$$H^2(G,\mathbb{T}) \cong H^2(G,\mathbb{R}); \tag{5.260}$$
$$H^2(G,\mathbb{R}) \cong H^2(\mathfrak{g},\mathbb{R}), \tag{5.261}$$

where $\mathbb{R}$ is the usual additive group, and $Z^2(G,\mathbb{R})$, $B^2(G,\mathbb{R})$, and hence $H^2(G,\mathbb{R})$ are defined analogously to $Z^2(G,\mathbb{T})$ etc. The first isomorphism is simply induced by

$$Z^2(G,\mathbb{R}) \to Z^2(G,\mathbb{T}); \tag{5.262}$$
$$\Gamma(x,y) \mapsto e^{i\Gamma(x,y)} = c(x,y), \tag{5.263}$$

which preserves exactness and induces an isomorphism in cohomology (but note that (5.262) - (5.263) may not itself define an isomorphism).

The isomorphism (5.261) is induced at the cochain level, too. Given a cocycle $\varphi \in Z^2(\mathfrak{g},\mathbb{R})$, we construct a new Lie algebra $\mathfrak{g}_\varphi$ (called a central extension of $\mathfrak{g}$) by taking $\mathfrak{g}_\varphi = \mathfrak{g} \oplus \mathbb{R}$ as a vector space, equipped though with the unusual bracket

$$[(X,v),(Y,w)] = ([X,Y],\varphi(X,Y)); \tag{5.264}$$

the condition $\varphi \in Z^2(\mathfrak{g},\mathbb{R})$ guarantees that this is a Lie bracket. Furthermore, $\mathfrak{g}_\varphi$ is isomorphic (as a Lie algebra) to a direct sum iff $\varphi \in B^2(\mathfrak{g},\mathbb{R})$; indeed, if (5.253) holds, then $(X,v) \mapsto (X,v - \theta(X))$ yields the desired isomorphism $\mathfrak{g}_\varphi \to \mathfrak{g} \oplus \mathbb{R}$. By Lie’s Third Theorem, there is a connected and simply connected Lie group $G_\varphi$ (again called a central extension of $G$), with Lie algebra $\mathfrak{g}_\varphi$. As a manifold, $G_\varphi = G \times \mathbb{R}$, but the group laws are given, in terms of a function $\Gamma : G \times G \to \mathbb{R}$, by

$$(x,v)\cdot(y,w) = (xy, v + w + \Gamma(x,y)); \tag{5.265}$$
$$(x,v)^{-1} = (x^{-1}, -v - \Gamma(x,x^{-1})). \tag{5.266}$$

The group axioms then imply (indeed, they are equivalent to) the condition $\Gamma \in Z^2(G,\mathbb{R})$. Furthermore, two such extensions $G_\varphi$ and $G_{\varphi'}$ are isomorphic iff the corresponding cocycles $\Gamma$ and $\Gamma'$ are related by the additive analogue of (5.244), and in particular, $\Gamma \in B^2(G,\mathbb{R})$ iff $G_\varphi$ is isomorphic (as a Lie group) to a direct product $G \times \mathbb{R}$, which in turn is the case iff $\varphi \in B^2(\mathfrak{g},\mathbb{R})$. Conversely, given $\Gamma \in Z^2(G,\mathbb{R})$, we define the central extension $G_\Gamma$ by (5.265) - (5.266), to find that the associated Lie algebra $\mathfrak{g}_\Gamma$ takes the above form, defining $\varphi \in Z^2(\mathfrak{g},\mathbb{R})$ through (5.264). Explicitly,

$$\varphi(X,Y) = \frac{d}{dt}\frac{d}{ds}\left[\Gamma(e^{tX},e^{sY})\right]_{s=t=0} - (X \leftrightarrow Y). \tag{5.267}$$

Lie’s Third Theorem thus implies that the map $\varphi \mapsto \Gamma$ (which is not necessarily a bijection) descends to an isomorphism $H^2(\mathfrak{g},\mathbb{R}) \to H^2(G,\mathbb{R})$ in cohomology.


Given (5.254), Theorem 5.55 immediately gives

$$H^2(\mathbb{R},\mathbb{T}) = 0. \tag{5.268}$$

In particular, if $\mathbb{R}$ is the relevant symmetry group, which is the case e.g. with time translation, by Proposition 5.53 we may restrict ourselves to unitary representations. Once again, this has nothing to do with abelianness or topological triviality of $\mathbb{R}$. Indeed, for $G = \mathfrak{g} = \mathbb{R}^2$, the Heisenberg cocycle (5.255) comes from the multiplier

$$c_0((p,q),(p',q')) = e^{i(pq' - qp')/2}, \tag{5.269}$$

where $\mathbb{R}^2$ is seen as the group of translations in the phase space $\mathbb{R}^2$ of a particle moving on $\mathbb{R}$. Accordingly, this multiplier is realized by the following projective representation of $\mathbb{R}^2$ on $L^2(\mathbb{R})$:

$$u(p,q)\psi(x) = e^{-ipq/2}e^{ipx}\psi(x - q). \tag{5.270}$$
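The projective composition law can be checked pointwise on a test function. The sketch below assumes the explicit forms $c_0((p,q),(p',q')) = e^{i(pq'-qp')/2}$ and $u(p,q)\psi(x) = e^{-ipq/2}e^{ipx}\psi(x-q)$, which is one choice of sign conventions compatible with the cocycle (5.255), and verifies $u(p,q)u(p',q') = c_0\,u(p+p',q+q')$ at a few sample points. The Gaussian test function and the function names are illustrative.

```python
import cmath

def c0(a, b):
    """Assumed multiplier c0((p, q), (p', q')) = exp(i (p q' - q p') / 2)."""
    (p, q), (pp, qq) = a, b
    return cmath.exp(0.5j * (p * qq - q * pp))

def u(p, q):
    """Assumed operator u(p, q) psi(x) = exp(-i p q / 2) exp(i p x) psi(x - q)."""
    def act(psi):
        return lambda x: cmath.exp(-0.5j * p * q) * cmath.exp(1j * p * x) * psi(x - q)
    return act

def gaussian(x):
    return cmath.exp(-x * x / 2)

def projective_law_holds(a, b, psi=gaussian, points=(-1.5, 0.0, 0.3, 2.0), tol=1e-12):
    """Check u(a) u(b) psi = c0(a, b) u(a + b) psi at the given sample points."""
    (p, q), (pp, qq) = a, b
    lhs = u(p, q)(u(pp, qq)(psi))
    rhs = u(p + pp, q + qq)(psi)
    return all(abs(lhs(x) - c0(a, b) * rhs(x)) < tol for x in points)
```

A pointwise check on finitely many points is of course no proof, but it catches sign errors in the phase factors quickly; the same law follows analytically by comparing exponents.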


If $\mathbb{R}^2$ is the configuration space of some particle, and the group $\mathbb{R}^2$ produces translations in the latter (i.e., of position), then the appropriate unitary representation would rather be on $L^2(\mathbb{R}^2)$ and would have trivial multiplier, viz.

$$u(q_1,q_2)\psi(x_1,x_2) = \psi(x_1 - q_1, x_2 - q_2). \tag{5.271}$$

Similarly, $G = \mathbb{R}^2$, now seen as generating translations of momentum in the phase space $\mathbb{R}^4$ of the latter example, would appropriately be represented on $L^2(\mathbb{R}^2)$ as

$$u(p_1,p_2)\psi(x_1,x_2) = e^{i(p_1x_1 + p_2x_2)}\psi(x_1,x_2). \tag{5.272}$$


Corollary 5.56. Let $G$ be a connected and simply connected semi-simple Lie group. Then $H^2(G,\mathbb{T})$ is trivial.


Here we say that a Lie group is simple when it has no proper connected normal subgroups, and semi-simple if it has no proper connected normal abelian subgroups. For example, the “classical Lie groups” of Weyl are semi-simple, including $SO(3)$ and $SU(2)$, which are even simple (note that the latter does have a discrete normal subgroup, namely its center $\{\pm 1_2\} \cong \mathbb{Z}_2$). Also, products of simple Lie groups are semi-simple. However, Corollary 5.56 does not apply to $SO(3)$, which is semi-simple but not simply connected. Here the relevant general result is:


Theorem 5.57. Let $G$ be a connected Lie group with $H^2(\mathfrak{g},\mathbb{R}) = 0$. Then

$$H^2(G,\mathbb{T}) \cong \widehat{\pi_1(G)}. \tag{5.273}$$

We need some background (cf. §C.15). For any abelian (topological) group $A$, the set

$$\hat{A} = \mathrm{Hom}(A,\mathbb{T}) \tag{5.274}$$

consists of all (continuous) homomorphisms (also called characters) $\chi : A \to \mathbb{T}$; these are just the irreducible (and hence necessarily one-dimensional) unitary representations of $A$. This set is a group under the obvious pointwise operations

$$\chi_1\chi_2(a) = \chi_1(a)\chi_2(a); \tag{5.275}$$
$$\chi^{-1}(a) = \chi(a)^{-1}. \tag{5.276}$$

As such, the group $\hat{A}$ is called the (Pontryagin) dual of $A$; the Pontryagin Duality Theorem states that $\hat{\hat{A}} \cong A$. Using Theorem 5.57 and Theorem 5.41, this gives

$$H^2(SO(3),\mathbb{T}) = \mathbb{Z}_2. \tag{5.277}$$

We now use Theorem 5.41 as a lemma to prove Theorem 5.57:


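Proof. We first state the map $\widehat{\pi_1(G)} \to H^2(G,\mathbb{T})$ that will turn out to be an isomorphism. Assuming Theorem 5.41, pick a (Borel measurable) cross-section

$$\tilde{s} : G \to \tilde{G} \tag{5.278}$$

of the canonical projection

$$\tilde\pi : \tilde{G} \to G = \tilde{G}/D. \tag{5.279}$$

As always, this means that $\tilde\pi \circ \tilde{s} = \mathrm{id}_G$, and $\tilde{s}$ is supposed to be smooth near the identity, and chosen such that $\tilde{s}(e_G) = e_{\tilde{G}}$, where $e_G$ and $e_{\tilde{G}}$ are the unit elements of $G$ and $\tilde{G}$, respectively. Given a character $\chi \in \widehat{\pi_1(G)}$, define $c_\chi : G \times G \to \mathbb{T}$ by

$$c_\chi(x,y) = \chi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}). \tag{5.280}$$

This makes sense: $\tilde\pi$ is a homomorphism, so that (cf. the computation below (5.238))

$$\tilde\pi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = \tilde\pi(\tilde{s}(x))\,\tilde\pi(\tilde{s}(y))\,\tilde\pi(\tilde{s}(xy))^{-1} = xy(xy)^{-1} = e_G,$$

and hence $\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1} \in \ker(\tilde\pi) = D$ (where we identify $D$ with $\pi_1(G)$, cf. Theorem 5.41). Furthermore, tedious computations show that (5.240) and (5.241) hold, so that $c_\chi \in Z^2(G,\mathbb{T})$. Different choices of $\tilde{s}$ lead to equivalent 2-cocycles $c_\chi$, and hence by taking the cohomology class $[c_\chi]$ of $c_\chi$ we obtain an injective map

$$\widehat{\pi_1(G)} \to H^2(G,\mathbb{T}); \tag{5.281}$$
$$\chi \mapsto [c_\chi]. \tag{5.282}$$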

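To prove surjectivity of this map, let $c \in Z^2(G,\mathbb{T})$ and define $\tilde{c} : \tilde{G} \times \tilde{G} \to \mathbb{T}$ by

$$\tilde{c}(\tilde{x},\tilde{y}) = c(\tilde\pi(\tilde{x}),\tilde\pi(\tilde{y})). \tag{5.283}$$

Conversely, we may recover $c$ from $\tilde{c}$ and some cross-section $\tilde{s} : G \to \tilde{G}$ of $\tilde\pi$ by

$$c(x,y) = \tilde{c}(\tilde{s}(x),\tilde{s}(y)). \tag{5.284}$$

It follows that $\tilde{c} \in Z^2(\tilde{G},\mathbb{T})$. Theorem 5.55 implies that $H^2(\tilde{G},\mathbb{T})$ is trivial, so that

$$\tilde{c}(\tilde{x},\tilde{y}) = \frac{\tilde{b}(\tilde{x}\tilde{y})}{\tilde{b}(\tilde{x})\tilde{b}(\tilde{y})}, \tag{5.285}$$

for some function $\tilde{b} : \tilde{G} \to \mathbb{T}$ satisfying $\tilde{b}(\tilde{e}) = 1$. From (5.241), i.e., $c(e,x) = 1$, we infer that if $\tilde{x} = \delta \in D$, so that $\tilde\pi(\delta) = e$, then $\tilde{c}(\delta,\tilde{y}) = 1$, and hence

$$\tilde{b}(\delta\tilde{y}) = \tilde{b}(\delta)\tilde{b}(\tilde{y}). \tag{5.286}$$

Taking $\tilde{x}$ and $\tilde{y}$ both in $D$, we see that $\tilde{b}_{|D}$ is a character, which we call $\chi$. Hence

$$\frac{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))}\,c(x,y) = \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))} = \chi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = c_\chi(x,y). \tag{5.287}$$

Thus $[c] = [c_\chi]$, and hence the map (5.281) - (5.282) is surjective.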


Definition 5.58. In the situation and notation of Theorem 5.41, a unitary representation $\tilde{u} : \tilde{G} \to U(H)$ is called admissible if $\tilde{u}(D) \subset \mathbb{T}\cdot 1_H$.


In that case, there is obviously a character $\chi \in \hat{D}$ such that for each $\delta \in D$ we have

$$\tilde{u}(\delta) = \chi(\delta)\cdot 1_H. \tag{5.288}$$

Unitary irreducible representations are admissible: since $D$ lies in the center of $\tilde{G}$, Schur’s Lemma implies that its image $\tilde{u}(D)$ consists of multiples of the unit. If $\tilde{u}$ is admissible, we obtain a homomorphism (5.231) by means of

$$h = \pi \circ \tilde{u} \circ \tilde{s}, \tag{5.289}$$

where $\tilde{s}$ is any cross-section of $\tilde\pi$, cf. (5.278) - (5.279). Note that different choices $\tilde{s}, \tilde{s}'$ are related by $\tilde{s}'(x) = \tilde{s}(x)\delta(x)$, where $\delta : G \to D$ is some function, so that

$$h'(x) = \pi(\tilde{u}(\tilde{s}'(x))) = \pi(\tilde{u}(\tilde{s}(x)\delta(x))) = \pi\big(\tilde{u}(\tilde{s}(x))\,\chi(\delta(x))\cdot 1_H\big) = h(x).$$


Theorem 5.59. 1. If $G$ is a connected Lie group with $H^2(\mathfrak{g},\mathbb{R}) = 0$, any homomorphism $h : G \to U(H)/\mathbb{T}$ as in (5.231) comes from some admissible unitary representation $\tilde{u}$ of $\tilde{G}$ by (5.289). If $H$ is separable, then $h$ is continuous iff $\tilde{u}$ is.

2. Moreover, if $\tilde{u}(\tilde{G})$ is super-admissible in that $\tilde{u}(\delta) = 1_H$ for each $\delta \in D$, then $u = \tilde{u} \circ \tilde{s}$ is a unitary representation of $G$, in which case $h = \pi \circ u$ therefore comes from a unitary representation of $G$ itself.


Proof. Given such a homomorphism $h$, pick a cross-section $s : U(H)/\mathbb{T} \to U(H)$, as in (5.234), with associated 2-cocycle $c$ on $G$ given by (5.239). By Theorem 5.57 and its proof, we may assume (possibly after redefining $s$) that there exists a character $\chi \in \hat{D}$ and a cross-section (5.278) such that $c = c_\chi$, cf. (5.280). We then define

$$\tilde{u} : \tilde{G} \to B(H); \tag{5.290}$$
$$\tilde{x} \mapsto \chi\big(\tilde{x}\cdot(\tilde{s} \circ \tilde\pi(\tilde{x}))^{-1}\big)\,u(\tilde\pi(\tilde{x})). \tag{5.291}$$

Simple computations then show that $\tilde{x}\cdot(\tilde{s} \circ \tilde\pi(\tilde{x}))^{-1} \in D$ (which lies in the center of $\tilde{G}$), that (5.288) holds, that each operator $\tilde{u}(\tilde{x})$ is unitary, that the group homomorphism properties $\tilde{u}(\tilde{x})\tilde{u}(\tilde{y}) = \tilde{u}(\tilde{x}\tilde{y})$ and $\tilde{u}(\tilde{e}) = 1_H$ hold, and that (5.289) is valid. As to the last equation, since $\pi$ removes the term with $\chi$ in (5.291), and $u = s \circ h$, we have

$$\pi \circ \tilde{u} \circ \tilde{s}(x) = \pi \circ s \circ h \circ \tilde\pi \circ \tilde{s}(x) = h(x),$$

since $\pi \circ s = \mathrm{id}$ (on $U(H)/\mathbb{T}$) and $\tilde\pi \circ \tilde{s} = \mathrm{id}$ (on $G$).

If $\chi(\delta) = 1$ for each $\delta \in D$, then $c_\chi = 1$ from (5.280), so that $u(x)u(y) = u(xy)$ by (5.238). If $s$ preserves units, or, equivalently, if $u(e) = 1_H$, as we always assume, we see that $u$ is a unitary representation of $G$. In this case, (5.291) simply reads $\tilde{u} = s \circ h \circ \tilde\pi$. This immediately yields $\tilde{u} = u \circ \tilde\pi$, which in turn gives $u = \tilde{u} \circ \tilde{s}$.

Finally, even if $h$ is continuous, it is a priori unclear if $\tilde{u}$ is, since the cross-sections $s$ and $\tilde{s}$ appearing in the above construction typically fail to be continuous. Fortunately, since they are assumed measurable, there is no question about measurability of $\tilde{u}$, and if $H$ is separable, continuity follows from Proposition 5.36.


Corollary 5.60. If $G$ is a connected Lie group with covering group $\tilde{G}$, the formulae


\[ \tilde{u} = u \circ \tilde{\pi}; \tag{5.292} \]
\[ u = \tilde{u} \circ \tilde{s}, \tag{5.293} \]


where $\tilde{s} : G \to \tilde{G}$ is any cross-section of the covering map $\tilde{\pi} : \tilde{G} \to G$, give a bijective
correspondence between (continuous) super-admissible unitary representations $\tilde{u}$ of
$\tilde{G}$ and (continuous) unitary representations $u$ of $G$, preserving irreducibility.


Corollary 5.61. Any homomorphism $h : SO(3) \to U(H)/\mathbb{T}$ as in (5.231) comes from
an admissible unitary representation $\tilde{u}$ of $SU(2)$ by (5.289). Moreover, $h$ comes from
a unitary representation $u = \tilde{u} \circ \tilde{s}$ of $SO(3)$ itself iff $\tilde{u}$ is trivial on the center $\mathbb{Z}_2$.

In particular, if $h$ is irreducible, it must come from one of the unitary irreducible
representations $\tilde{u} = D_j$, where $j = 0, \tfrac{1}{2}, 1, \ldots$ is the (half-)integer spin label. Then
$D_j(SU(2))$ is super-admissible iff $j$ is integral, in which case it defines a unitary
irreducible representation of $SO(3)$.
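
The parity rule in Corollary 5.61 can be checked numerically. The sketch below is only an illustration and assumes the standard weight basis, in which $D_j$ maps $\exp(-i\theta\sigma_3/2) \in SU(2)$ to the diagonal matrix with entries $e^{-i\theta m}$, $m = -j, \ldots, j$; at $\theta = 2\pi$ this group element is the nontrivial central element $-1$. The function names are hypothetical and do not occur in the text.

```python
import cmath
import math


def image_of_minus_one(two_j):
    """Diagonal entries of D_j(-1), with the spin passed as the integer 2j."""
    # Weights m = -j, -j+1, ..., j, stored as 2m to stay in integer arithmetic.
    doubled_weights = range(-two_j, two_j + 1, 2)
    # D_j(exp(-i*theta*sigma_3/2)) has entries exp(-i*theta*m); take theta = 2*pi,
    # so that exp(-i*2*pi*m) = exp(-i*pi*(2m)).
    return [cmath.exp(-1j * math.pi * two_m) for two_m in doubled_weights]


def is_super_admissible(two_j, tol=1e-12):
    """True iff D_j sends the central element -1 of SU(2) to the unit operator."""
    return all(abs(entry - 1) < tol for entry in image_of_minus_one(two_j))


for two_j in range(5):
    print(f"j = {two_j}/2: super-admissible = {is_super_admissible(two_j)}")
```

Under these assumptions only the integer spins pass the test, in line with the corollary.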


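Indeed, the assumption $H^2(\mathfrak{g},\mathbb{R}) = 0$ in Theorem 5.59 is satisfied for $SO(3)$ because
of Whitehead's Lemma 5.54. The case where $H^2(\mathfrak{g},\mathbb{R}) \neq 0$ occurs e.g. for
the Galilei group (cf. §7.6). It can be shown that $H^2(\mathfrak{g},\mathbb{R})$ has finitely many generators,
for which one finds pre-images $(\varphi_1, \ldots, \varphi_M)$ in $Z^2(\mathfrak{g},\mathbb{R})$, with corresponding
elements $(\Gamma_1, \ldots, \Gamma_M)$ of $Z^2(\tilde{G},\mathbb{R})$, cf. the proof of Theorem 5.55. Of these, a
subset $(\Gamma_1, \ldots, \Gamma_N)$, $N \leq M$, satisfies the relation $\Gamma_j(\delta,\tilde{x}) = \Gamma_j(\tilde{x},\delta)$ for any $\delta \in D$
(cf. Theorem 5.41) and $\tilde{x} \in \tilde{G}$. This yields a map $\Gamma : \tilde{G} \times \tilde{G} \to \mathbb{R}^N$ given by
$\Gamma(\tilde{x},\tilde{y}) = (\Gamma_1(\tilde{x},\tilde{y}), \ldots, \Gamma_N(\tilde{x},\tilde{y}))$, which in turn equips the set


\[ \check{G} = \tilde{G} \times \mathbb{R}^N, \tag{5.294} \]

with a group multiplication $(\tilde{x},v) \cdot (\tilde{y},w) = (\tilde{x}\tilde{y}, v + w + \Gamma(\tilde{x},\tilde{y}))$. We then have the
following generalization of Theorem 5.59, in which a unitary representation $\check{u}$ of $\check{G}$
is called admissible if $\check{u}(\delta,v) \in \mathbb{T} \cdot 1_H$ for any $\delta \in D$ and $v \in \mathbb{R}^N$.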


Theorem 5.62. Let $G$ be a connected Lie group, and $H$ a separable Hilbert space.
Then any continuous homomorphism $h : G \to U(H)/\mathbb{T}$ comes from some admissible
continuous unitary representation $\check{u}$ of $\check{G}$.


As we only apply this to the Galilei group (where $N = 1$), basically only for illustrative
purposes, we omit the proof. The correct (and natural) notion of equivalence
of projective representations is as follows: we say that two such homomorphisms
$h_i : G \to U(H_i)/\mathbb{T}$, $i = 1,2$, are equivalent if there is a unitary $w : H_1 \to H_2$ such that


\[ \mathrm{Ad}_w(h_1(x)) = h_2(x), \quad x \in G, \tag{5.295} \]


where $\mathrm{Ad}_w : U(H_1)/\mathbb{T} \to U(H_2)/\mathbb{T}$ is the map $[u] \mapsto [wuw^*]$, which is well defined
(here $[u]$ is the equivalence class of $u \in U(H)$ in $U(H)/\mathbb{T}$ under $u \sim zu$, $z \in \mathbb{T}$).
This induces the following notion for $\check{G}$: two admissible unitary representations
$\check{u}_1, \check{u}_2$ of $\check{G}$ on Hilbert spaces $H_1, H_2$ are equivalent if there is a unitary $w : H_1 \to H_2$
and a map $b : \check{G} \to \mathbb{T}$ such that $w\check{u}_1(\check{x})w^* = b(\check{x})\check{u}_2(\check{x})$, for any $\check{x} \in \check{G}$. It can be shown
that such a map $b$ always comes from a character $\tilde{\chi} : \tilde{G} \to \mathbb{T}$ through $b(\tilde{x},v) = \tilde{\chi}(\tilde{x})$.


To close this long and difficult section, in relief it should be mentioned that the
above theory vastly simplifies if $H$ is finite-dimensional. By Theorem 5.40, this is
true, for example, if $G$ is compact and $u$ is irreducible. Suppose $u : G \to U(H)$ is
merely a projective unitary representation of $G$, so that instead of (5.157) one has


\[ [u'(X), u'(Y)] = u'([X,Y]) + i\varphi(X,Y) \cdot 1_H, \tag{5.296} \]

where $\varphi$ is given by (5.267). Since the trace of a commutator vanishes, taking the trace yields

\[ \varphi(X,Y) = \frac{i}{n}\,\mathrm{Tr}\big(u'([X,Y])\big), \tag{5.297} \]

where $n = \dim(H) < \infty$. We may define a linear function $\theta : \mathfrak{g} \to \mathbb{R}$ by

\[ \theta(X) = \frac{i}{n}\,\mathrm{Tr}\big(u'(X)\big), \tag{5.298} \]

so that $\varphi(X,Y) = \theta([X,Y])$, cf. (5.253), and hence we may remove $\varphi$ by redefining

\[ \tilde{u}'(X) = u'(X) + i\theta(X) \cdot 1_H, \tag{5.299} \]

which satisfies (5.157) - (5.158). Hence by Corollary 5.43 the map $\tilde{u}'$ exponentiates
to a unitary representation $\tilde{u}$ of the universal covering group $\tilde{G}$ of $G$; it should be
checked from the values of $\tilde{u}$ on $D$ if $\tilde{u}$ also defines a unitary representation of $G$.


This argument shows that finite-dimensional projective unitary representations of 
Lie groups always come from unitary representations of the covering group. 
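
The trace argument above can be tried out on a small example. The following sketch is an illustration under stated assumptions, not part of the text: it takes the basis $X_k = -i\sigma_k/2$ of $\mathfrak{su}(2)$ (so that $[X_1,X_2] = X_3$), shifts each generator by an arbitrarily chosen multiple of the unit to mimic a projective representation, and then checks that $\varphi(X_1,X_2) = \theta(X_3)$ and that the redefined generators close without a central term. All names and the numbers in `SHIFTS` are hypothetical.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    return mat_add(mat_mul(a, b), mat_scale(-1, mat_mul(b, a)))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SHIFTS = [0.3, -1.1, 0.7]  # arbitrary real numbers, chosen only for illustration


def u_prime(k):
    """Shifted generator u'(X_k) = -i*sigma_k/2 + i*c_k*1, which is skew-adjoint."""
    return mat_add(mat_scale(-0.5j, PAULI[k]), mat_scale(1j * SHIFTS[k], IDENTITY))


def theta(k):
    """theta(X_k) = (i/n) Tr(u'(X_k)) with n = 2, cf. (5.298)."""
    return (1j / 2 * trace(u_prime(k))).real


def u_tilde_prime(k):
    """Redefined generator u'(X_k) + i*theta(X_k)*1, cf. (5.299)."""
    return mat_add(u_prime(k), mat_scale(1j * theta(k), IDENTITY))


# The defect [u'(X_1), u'(X_2)] - u'(X_3) equals i*phi(X_1, X_2)*1, cf. (5.296).
defect = mat_add(commutator(u_prime(0), u_prime(1)), mat_scale(-1, u_prime(2)))
phi_12 = (defect[0][0] / 1j).real

print("phi(X1, X2) =", phi_12, " theta(X3) =", theta(2))
print("residual after redefinition:",
      max_abs_diff(commutator(u_tilde_prime(0), u_tilde_prime(1)), u_tilde_prime(2)))
```

In this toy case the redefinition simply removes the shifts again, which is what (5.299) is designed to do.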
5.11 Position, momentum, and free Hamiltonian


The three basic operators of non-relativistic quantum mechanics are position, denoted
$q$, momentum, $p$, and the free Hamiltonian $h_0$. Assuming for simplicity that
the particle moves in one dimension, these are informally given on $H = L^2(\mathbb{R})$ by


\[ q\psi(x) = x\psi(x); \tag{5.300} \]
\[ p\psi(x) = -i\hbar \frac{d}{dx}\psi(x); \tag{5.301} \]
\[ h_0\psi(x) = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2}\psi(x), \tag{5.302} \]


where $m$ is the mass of the particle under consideration. We put $\hbar = 1$ and $m = 1/2$.

The issue is that these operators are unbounded; see §B.13. In general, quantum-mechanical
observables are supposed to be represented by self-adjoint operators,
and examples like (5.300) - (5.302) show that these may not be bounded. The
Hellinger–Toeplitz Theorem B.68 then shows that it makes no sense to try and extend
the above expressions to all of $L^2(\mathbb{R})$, so we have to live with the fact that some
crucial operators $a : D(a) \to H$ are merely defined on a dense subspace $D(a) \subset H$.

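Each such operator has an adjoint $a^* : D(a^*) \to H$, whose domain $D(a^*) \subset H$
consists of all $\psi \in H$ for which the functional $\varphi \mapsto \langle \psi, a\varphi \rangle$ is bounded on $D(a)$,
and hence (since $D(a)$ is dense in $H$) can be extended to all of $H$ by continuity
through the unique "Riesz–Fréchet vector" $\chi$ for which $\langle \psi, a\varphi \rangle = \langle \chi, \varphi \rangle$. Writing
$\chi = a^*\psi$, for each $\psi \in D(a^*)$ and $\varphi \in D(a)$ we therefore have


\[ \langle a^*\psi, \varphi \rangle = \langle \psi, a\varphi \rangle. \tag{5.303} \]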


Assuming that $D(a)$ is dense in $H$, we say that $a$ is self-adjoint, written $a^* = a$, if


\[ \langle a\varphi, \psi \rangle = \langle \varphi, a\psi \rangle, \tag{5.304} \]


for each $\psi, \varphi \in D(a)$ and $D(a^*) = D(a)$. A self-adjoint operator $a$ is automatically
closed, in that its graph $G(a) = \{(\psi, a\psi) \mid \psi \in D(a)\}$ is a closed subspace of the
Hilbert space $H \oplus H$ (indeed, the adjoint of any densely defined operator is closed,
see Proposition B.72). In practice, self-adjoint operators often arise as closures of
essentially self-adjoint operators $a$, which by definition satisfy $a^{**} = a^*$. Equivalently,
such an operator is closable, in that the closure of its graph is the graph of
some (uniquely defined) operator, called the closure $a^-$ of $a$, and furthermore this
closure is self-adjoint, so that $a^- = a^*$. If $a$ is closable, the domain $D(a^-)$ of its
closure consists of all $\psi \in H$ for which there exists a sequence $(\psi_n)$ in $D(a)$ such
that $\psi_n \to \psi$ and $a\psi_n$ converges, on which we define $a^-$ by $a^-\psi = \lim_n a\psi_n$.
The simplest case is the position operator.


Theorem 5.63. The operator $q$ is self-adjoint on the domain


\[ D(q) = \left\{ \psi \in L^2(\mathbb{R}) \;\Big|\; \int_{\mathbb{R}} dx\, x^2 |\psi(x)|^2 < \infty \right\}. \tag{5.305} \]

See Proposition B.73 for the proof. To give a convenient domain of essential self-adjointness
(also for the other two operators), we need a little distribution theory.


Definition 5.64. The Schwartz space $\mathcal{S}(\mathbb{R})$ (whose elements are functions of
rapid decrease) consists of all smooth functions $f : \mathbb{R} \to \mathbb{C}$ for which each expression


\[ \|f\|_{n,m} = \sup\{ |x^n f^{(m)}(x)|,\; x \in \mathbb{R} \}, \tag{5.306} \]


where $f^{(m)}$ is the $m$'th derivative of $f$, is finite. The topology of $\mathcal{S}(\mathbb{R})$ is given by
saying that a sequence (or net) $f_\lambda$ converges to $f$ iff $\|f_\lambda - f\|_{n,m} \to 0$ for all $n,m \in \mathbb{N}$.


Each $\|\cdot\|_{n,m}$ happens to be a norm, but positive definiteness is nowhere used in the
theory below (which therefore works for families of seminorms, which satisfy the
axioms of a norm except perhaps for positive definiteness). Since there are countably
many such (semi)norms defining the topology, we may equivalently say that $\mathcal{S}(\mathbb{R})$
is a metric space defined by


\[ d(f,g) = \sum_{n,m=0}^{\infty} 2^{-n-m}\, \frac{\|f - g\|_{n,m}}{1 + \|f - g\|_{n,m}}. \tag{5.307} \]


Indeed, $\mathcal{S}(\mathbb{R})$ is complete in this metric. A typical element is $f(x) = \exp(-x^2)$.
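
To get a feeling for the seminorms (5.306), one can approximate a few of them for the typical element $f(x) = \exp(-x^2)$. The sketch below replaces the supremum over $\mathbb{R}$ by a maximum over a finite grid, which is an assumption that is harmless here because of the rapid decay; the derivative is supplied by hand, and the helper names are illustrative only.

```python
import math


def gaussian(x):
    return math.exp(-x * x)


def gaussian_derivative(x):
    # First derivative of exp(-x^2), computed by hand.
    return -2.0 * x * math.exp(-x * x)


def seminorm(func, n, half_width=10.0, samples=20001):
    """Grid approximation of sup |x^n func(x)| over [-half_width, half_width]."""
    step = 2.0 * half_width / (samples - 1)
    points = (-half_width + i * step for i in range(samples))
    return max(abs(x ** n * func(x)) for x in points)


print("||f||_{0,0} ~", seminorm(gaussian, 0))             # sup |f|
print("||f||_{1,0} ~", seminorm(gaussian, 1))             # sup |x f|
print("||f||_{0,1} ~", seminorm(gaussian_derivative, 0))  # sup |f'|
```

Elementary calculus gives the exact values $1$, $1/\sqrt{2e}$ and $\sqrt{2/e}$ for these three seminorms, which the grid values should reproduce to several digits.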


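Definition 5.65. A tempered distribution is a continuous linear map $\varphi : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$.
The space of all such maps, equipped with the topology of pointwise convergence
(i.e., $\varphi_\lambda \to \varphi$ iff $\varphi_\lambda(f) \to \varphi(f)$ for each $f \in \mathcal{S}(\mathbb{R})$) is denoted by $\mathcal{S}'(\mathbb{R})$.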


It can be shown that (because of the metrizability of $\mathcal{S}(\mathbb{R})$) continuity is the same
as sequential continuity, i.e., some linear map $\varphi : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ belongs to $\mathcal{S}'(\mathbb{R})$ iff
$\lim_N \varphi(f_N) = \varphi(f)$ for each convergent sequence $f_N \to f$ in $\mathcal{S}(\mathbb{R})$. Like $\mathcal{S}(\mathbb{R})$,
the tempered distributions $\mathcal{S}'(\mathbb{R})$ form a (locally convex) topological vector space,
that is, a vector space with a topology in which addition and scalar multiplication
are continuous. The topology of $\mathcal{S}'(\mathbb{R})$ is given by a family of seminorms, namely
$\|\varphi\|_f = |\varphi(f)|$, $f \in \mathcal{S}(\mathbb{R})$. A simple way to prove that $\varphi \in \mathcal{S}'(\mathbb{R})$ is to
find some $(n,m)$ for which $|\varphi(f)| \leq C\|f\|_{n,m}$ for each $f \in \mathcal{S}(\mathbb{R})$, since in that case
$f_N \to f$, which means that $\|f_N - f\|_{n,m} \to 0$ for all $n,m \in \mathbb{N}$, certainly implies that
$\varphi(f_N) \to \varphi(f)$, so that $\varphi$ is continuous. For example, the evaluation maps $\delta_x$ defined
by $\delta_x(f) = f(x)$ are continuous (take $n = m = 0$). Similarly, each finite measure on
$\mathbb{R}$ defines a tempered distribution. Taking the $(0,m)$ seminorm shows that the maps
$f \mapsto f^{(m)}(x)$ for fixed $m \in \mathbb{N}$ and $x \in \mathbb{R}$ are tempered distributions.
A less obvious example (defining a so-called Gelfand triple) is as follows:


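Proposition 5.66. We have continuous dense inclusions

\[ \mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}), \tag{5.308} \]


where the second inclusion identifies $\varphi \in L^2(\mathbb{R})$ with the map


\[ f \mapsto \varphi(f) = \int_{\mathbb{R}} dx\, \varphi(x) f(x). \tag{5.309} \]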
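Proof. As vector spaces, the first inclusion is obvious. For $f \in \mathcal{S}(\mathbb{R})$ we estimate


\[ \|f\|_2^2 = \int_{\mathbb{R}} dx\, |f(x)| \cdot |f(x)| \leq \|f\|_1 \|f\|_\infty; \tag{5.310} \]
\[ \|f\|_1 = \int_{\mathbb{R}} dx\, \frac{(1+x^2)|f(x)|}{1+x^2} \leq \|(1+x^2)f\|_\infty \int_{\mathbb{R}} \frac{dy}{1+y^2} \leq \pi \left( \|f\|_\infty + \|f\|_{2,0} \right), \tag{5.311} \]

so that, noting that $\|\cdot\|_{0,0} = \|\cdot\|_\infty$, we have

\[ \|f\|_2^2 \leq \pi \left( \|f\|_{0,0} + \|f\|_{2,0} \right) \|f\|_\infty. \tag{5.312} \]


Hence $f_\lambda \to f$ in $\mathcal{S}(\mathbb{R})$, which incorporates the conditions $\|f_\lambda - f\|_{0,0} \to 0$
and $\|f_\lambda - f\|_{2,0} \to 0$, implies $\|f_\lambda - f\|_2 \to 0$. This shows that the first inclusion
in (5.308) is continuous. Density may be proved in two steps. First, take some
fixed positive function $h \in C_c^\infty((-1,1))$ with the property $\int dx\, h(x) = 1$, and define
$h_n(x) = n h(nx)$, so that informally $h_n \in C_c^\infty(\mathbb{R})$ converges to a $\delta$-function as $n \to \infty$.
For each $\psi \in L^2(\mathbb{R})$, we consider the convolution $h_n * \psi$, where for suitable $f$, $g$,


\[ f * g(x) = \int_{\mathbb{R}} dy\, f(x-y) g(y). \tag{5.313} \]


Then $h_n * \psi \in C^\infty(\mathbb{R}) \cap L^2(\mathbb{R})$ and, from elementary analysis, $\|h_n * \psi - \psi\| \to 0$.
Second, for $\psi \in C_c(\mathbb{R})$, the functions $h_n * \psi$ lie in $C_c^\infty(\mathbb{R})$ and hence in $\mathcal{S}(\mathbb{R})$.

Since $C_c(\mathbb{R})$ is dense in $L^2(\mathbb{R})$ by Theorem B.30, for $\psi \in L^2(\mathbb{R})$ and $\varepsilon > 0$ we
can find $\varphi \in C_c(\mathbb{R})$ such that $\|\psi - \varphi\| < \varepsilon/2$, and (as just shown) find $n$ such that
$\|\varphi - \varphi_n\| < \varepsilon/2$, where $\varphi_n = h_n * \varphi \in \mathcal{S}(\mathbb{R})$, whence $\|\psi - \varphi_n\| < \varepsilon$.
This proves that $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$.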
The second inclusion is continuous by Cauchy–Schwarz, which gives


\[ |\varphi(f)| \leq \|\varphi\|_2 \|f\|_2, \]


to be combined with (5.312). It should be noted that also the second inclusion in
(5.308) is indeed an injection, i.e., that $\varphi(f) = 0$ for each $f \in \mathcal{S}(\mathbb{R})$ implies $\varphi = 0$
in $L^2(\mathbb{R})$; this is true because $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, plus the standard fact that, in
any Hilbert space $H$, if $\langle \varphi, f \rangle = 0$ for all $f$ in some dense subspace of $H$, then $\varphi = 0$.
Finally, the fact that $L^2(\mathbb{R})$ is dense in the seemingly huge space $\mathcal{S}'(\mathbb{R})$ follows
from the even more remarkable fact that $\mathcal{S}(\mathbb{R})$ is dense in $\mathcal{S}'(\mathbb{R})$. On top of the
functions $h_n$ just defined, also employ a function $\chi \in C_c^\infty(\mathbb{R})$ such that $\chi(x) = 1$ on
$(-1,1)$, and define $\chi_n(x) = \chi(x/n)$, so that informally $\lim_{n \to \infty} \chi_n(x) = 1$ (as opposed
to the $h_n$, which converge to a $\delta$-function as $n \to \infty$). If for any $g \in \mathcal{S}(\mathbb{R})$ and any
$\varphi \in \mathcal{S}'(\mathbb{R})$ we define $g\varphi$ as the distribution that maps $f \in \mathcal{S}(\mathbb{R})$ to $\varphi(fg)$, and
similarly define $g * \varphi$ as the distribution that maps $f$ to $\varphi(g * f)$, we may define a
sequence of distributions $\varphi_n = h_n * (\chi_n \varphi)$. From the point of view of (5.308), these
correspond to functions $\varphi_n \in \mathcal{S}(\mathbb{R})$ in the sense that $\varphi_n(f) = \int dx\, \varphi_n(x) f(x)$, where
$f \in \mathcal{S}(\mathbb{R})$. Using similar analysis as above, it then follows that for any $f \in \mathcal{S}(\mathbb{R})$
we have $\varphi_n(f) \to \varphi(f)$, so that $\varphi_n \to \varphi$ in $\mathcal{S}'(\mathbb{R})$.
For our purposes, the point of all this is that we can define generalized derivatives
of (tempered) distributions, and hence, because of (5.308), of functions in $L^2(\mathbb{R})$.


Definition 5.67. For $\varphi \in \mathcal{S}'(\mathbb{R})$ and $m \in \mathbb{N}$, the $m$'th generalized derivative $\varphi^{(m)}$
is defined by

\[ \varphi^{(m)}(f) = (-1)^m \varphi(f^{(m)}). \tag{5.314} \]

The idea is that under (5.308) this is an identity if $\varphi \in \mathcal{S}(\mathbb{R})$ (partial integration).
Like the constructions at the end of the proof of Proposition 5.66, this is a special
case of a more general construction: whenever we have a continuous linear map
$T : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$, we obtain a dual continuous linear map $T' : \mathcal{S}'(\mathbb{R}) \to \mathcal{S}'(\mathbb{R})$
defined by $T'\varphi = \varphi \circ T$, i.e.,


\[ (T'\varphi)(f) = \varphi(T(f)). \tag{5.315} \]


Sometimes a slight change in the definition (as in (5.314), or as in the Fourier transform
below) is appropriate so that the restriction of $T'$ to $\mathcal{S}(\mathbb{R})$ coincides with $T$.


Theorem 5.68. The momentum operator $p = -i\,d/dx$ is self-adjoint on the domain

\[ D(p) = \{ \psi \in L^2(\mathbb{R}) \mid \psi' \in L^2(\mathbb{R}) \}, \tag{5.316} \]

where the derivative $\psi'$ is taken in the distributional sense (i.e., letting $\psi \in \mathcal{S}'(\mathbb{R})$).
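Proof. We first show that $p$ is symmetric, or $p \subset p^*$. This comes down to

\[ \langle \psi', \varphi \rangle = -\langle \psi, \varphi' \rangle, \tag{5.317} \]


for each $\psi, \varphi \in D(p)$, where both derivatives are "generalized". The most elegant
proof (though perhaps not the shortest) uses the Sobolev space $H^1(\mathbb{R})$, which equals
$D(p)$ as a vector space, now equipped, however, with the new inner product


\[ \langle \psi, \varphi \rangle_{(1)} = \langle \psi, \varphi \rangle + \langle \psi', \varphi' \rangle, \tag{5.318} \]

with both inner products on the right-hand side in $L^2(\mathbb{R})$; the associated norm is

\[ \|\psi\|_{(1)}^2 = \|\psi\|^2 + \|\psi'\|^2. \tag{5.319} \]

Similar to the Gelfand triple (5.308), we have dense continuous inclusions

\[ \mathcal{S}(\mathbb{R}) \subset H^1(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}), \tag{5.320} \]

with analogous proof. All we need for Theorem 5.68 is the first inclusion of the
triple (5.320): for $\psi \in H^1(\mathbb{R})$ we now have $h_n * \psi \in C^\infty(\mathbb{R}) \cap H^1(\mathbb{R})$ as well as
$h_n * \psi \to \psi$ in $H^1(\mathbb{R})$, both of which follow from the $L^2$-case plus the identity


\[ (h_n * \psi)' = h_n * \psi'. \tag{5.321} \]


Using the same cutoff function $\chi$ as in the $L^2$ case, we have $\chi_n \psi \to \psi$ and $\chi_n' \psi \to 0$
in $L^2(\mathbb{R})$, so that $(\chi_n \psi)' \to \psi'$ in $L^2(\mathbb{R})$ and hence $\chi_n \psi \to \psi$ also in $H^1(\mathbb{R})$.
Furthermore, the functions $\psi_n = h_n * (\chi_n \psi)$ lie in $C_c^\infty(\mathbb{R})$ and hence in $\mathcal{S}(\mathbb{R})$; using
the above facts we obtain $\psi_n \to \psi$ in $H^1(\mathbb{R})$. In sum, for each $\psi \in H^1(\mathbb{R})$ we can
find a sequence $(\psi_n)$ in $\mathcal{S}(\mathbb{R})$ such that $\psi_n \to \psi$ and $\psi_n' \to \psi'$ in $L^2(\mathbb{R})$. Hence


\[ \langle \psi', \varphi \rangle = \lim_n \langle \psi_n', \varphi \rangle = -\lim_n \langle \psi_n, \varphi' \rangle = -\langle \psi, \varphi' \rangle. \tag{5.322} \]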
For the converse, let $\psi \in D(p^*)$, so that by definition for each $\varphi \in D(p)$ we have


\[ \langle p^*\psi, \varphi \rangle = \langle \psi, p\varphi \rangle = -i\langle \psi, \varphi' \rangle. \tag{5.323} \]


Since $\mathcal{S}(\mathbb{R}) \subset D(p)$, this is true in particular for each $\varphi \in \mathcal{S}(\mathbb{R})$, in which case
the right-hand side equals $-i\psi'(\varphi)$, where the derivative is distributional. But this
equals $\langle p^*\psi, \varphi \rangle$ and so the distribution $-i\psi'$ is given by taking the inner product
with $p^*\psi \in L^2(\mathbb{R})$. Hence $-i\psi' = p^*\psi \in L^2(\mathbb{R})$, and in particular $\psi' \in L^2(\mathbb{R})$, so
that $\psi \in D(p)$. This proves that $D(p^*) \subset D(p)$, and since from the first step we have
the opposite inclusion, we find $D(p^*) = D(p)$ and $p^* = p$.


For the free Hamiltonian $h_0 = -\Delta$ with $\Delta = d^2/dx^2$, we similarly have:


Theorem 5.69. The free Hamiltonian $h_0 = -\Delta$ is self-adjoint on the domain

\[ D(\Delta) = \{ \psi \in L^2(\mathbb{R}) \mid \psi'' \in L^2(\mathbb{R}) \}, \tag{5.324} \]


where the double derivative $\psi''$ is taken in the distributional sense.


Although this may be proved in an analogous way, such proofs become increasingly
burdensome as the number of derivatives grows. It is easier to use the Fourier
transform (which also provides an alternative way of proving Theorem 5.68).


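Theorem 5.70. The formulae


\[ \hat{f}(k) = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx} f(x); \tag{5.325} \]
\[ \check{f}(x) = \int_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi}}\, e^{ikx} f(k), \tag{5.326} \]


are rigorously defined on $\mathcal{S}(\mathbb{R})$, $L^2(\mathbb{R})$, and $\mathcal{S}'(\mathbb{R})$, and provide continuous isomorphisms
of each of these spaces. Furthermore, (5.326) is inverse to (5.325), i.e.


\[ \check{\hat{f}} = \hat{\check{f}} = f, \tag{5.327} \]


so that we may (and often do) write $\hat{f} = \mathcal{F}(f)$ and $\check{f} = \mathcal{F}^{-1}(f)$, or $f = \mathcal{F}^{-1}(\hat{f})$.
In all three cases we have the identities (in a distributional sense if appropriate)


\[ \mathcal{F}(x^n f^{(m)})(k) = (i\,d/dk)^n (ik)^m \mathcal{F}(f)(k). \tag{5.328} \]

Finally, as a map $\mathcal{F} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ the Fourier transform is unitary, so that


\[ \langle \hat{\psi}, \hat{\varphi} \rangle = \langle \psi, \varphi \rangle. \tag{5.329} \]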
See §C.15 for further discussion. For example, we have


\[ D(p) = \{ \psi \in L^2(\mathbb{R}) \mid k \cdot \hat{\psi}(k) \in L^2(\mathbb{R}) \}; \tag{5.330} \]
\[ D(\Delta) = \{ \psi \in L^2(\mathbb{R}) \mid k^2 \cdot \hat{\psi}(k) \in L^2(\mathbb{R}) \}. \tag{5.331} \]


Thus we may now reformulate Theorems 5.68 and 5.69 as follows:


Theorem 5.71. The momentum operator $p$ is self-adjoint on the domain (5.330).
The free Hamiltonian $h_0 = -\Delta$ is self-adjoint on the domain (5.331).


Proof. Denoting multiplication by $x^n$ by the symbol $k^n$, we have


\[ p = \mathcal{F}^{-1} k \mathcal{F}; \tag{5.332} \]
\[ \Delta = -\mathcal{F}^{-1} k^2 \mathcal{F}. \tag{5.333} \]

Hence the theorem follows from Proposition B.73 and unitarity of the Fourier transform
$\mathcal{F}$ (plus the little observation that if $a = a^*$ on $D(a) \subset H$ and $u : H \to K$ is
unitary, then $b = uau^*$ is self-adjoint on $D(b) = uD(a) \subset K$).
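
The relation $p = \mathcal{F}^{-1} k \mathcal{F}$ has a familiar discrete counterpart, sketched below under explicit assumptions: the line is replaced by a periodic grid of length `length`, the Fourier transform by a naive discrete Fourier transform, and the test function is the Gaussian $\psi(x) = \exp(-x^2)$, for which $p\psi(x) = -i\psi'(x) = 2ix\exp(-x^2)$. The grid parameters and function names are illustrative choices, not part of the text.

```python
import cmath
import math


def dft(values, sign):
    """Naive discrete Fourier sum with kernel exp(sign * 2*pi*i * row * col / n)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * row * col / n) for col, v in enumerate(values))
        for row in range(n)
    ]


def momentum_via_fourier(psi_samples, length):
    """Apply the discrete analogue of F^{-1} k F to samples on a periodic grid."""
    n = len(psi_samples)
    psi_hat = dft(psi_samples, -1)
    # Signed wave numbers k = 2*pi*m/length with m in -n/2, ..., n/2 - 1.
    wave_numbers = [2 * math.pi * (m if m < n // 2 else m - n) / length for m in range(n)]
    multiplied = [k * c for k, c in zip(wave_numbers, psi_hat)]
    return [value / n for value in dft(multiplied, +1)]


LENGTH, POINTS = 20.0, 64
GRID = [-LENGTH / 2 + j * LENGTH / POINTS for j in range(POINTS)]
PSI = [math.exp(-x * x) for x in GRID]
EXACT = [2j * x * math.exp(-x * x) for x in GRID]

numerical = momentum_via_fourier(PSI, LENGTH)
max_error = max(abs(a - b) for a, b in zip(numerical, EXACT))
print("max deviation from -i psi':", max_error)
```

For a rapidly decaying smooth function such as this Gaussian the deviation is far below the grid spacing, which reflects the fact that multiplication by $k$ in Fourier space is exact differentiation for band-limited data.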


Much is known about regularity properties of functions in such domains, e.g.,


\[ D(p) \subset C_0(\mathbb{R}); \tag{5.334} \]
\[ D(\Delta) \subset C_0^{(1)}(\mathbb{R}). \tag{5.335} \]


These are the most elementary cases of the famous Sobolev Embedding Theorem.
If $\psi \in D(p)$, then $k \mapsto (1+k^2)^{1/2}\hat{\psi}(k)$ is in $L^2(\mathbb{R})$, so applying Hölder's inequality
(B.15) with $p = q = 2$ to $f(k) = (1+k^2)^{1/2}\hat{\psi}(k)$ and $g(k) = (1+k^2)^{-1/2}$, which
is in $L^2(\mathbb{R})$, too, gives $\hat{\psi} \in L^1(\mathbb{R})$. The Riemann–Lebesgue Lemma (see §C.15) then
yields $\psi \in C_0(\mathbb{R})$. To prove (5.335), one uses $(1+k^2)$ rather than its square root.
Finally, we give a common domain of essential self-adjointness for $q$, $p$, and $h_0$.


Proposition 5.72. The operators $q$, $p$, and $h_0$ are essentially self-adjoint on $\mathcal{S}(\mathbb{R})$.


Proof. We see from (5.332) that the cases of $p$ and $q$ are similar, so we only explain
the case of $q$. Denoting the operator of multiplication by $x$ on the domain $\mathcal{S}(\mathbb{R})$ by
$q_0$, as in the proof of Proposition B.73 it is easy to see that $D(q_0^*) = D(q)$. Fourier-transforming,
the fact that $\mathcal{S}(\mathbb{R})$ is dense in $H^1(\mathbb{R})$ (cf. the proof of Theorem 5.68)
shows that $D(q_0^-) = D(q)$, so that $D(q_0^*) = D(q_0^-)$. The actions of $q_0^*$ and $q_0^-$ obviously
being given by multiplication by $x$ in both cases, we have $q_0^* = q_0^-$.

The proof for $h_0$ is similar; in the second step we now use the fact that $\mathcal{S}(\mathbb{R})$ is
dense in $H^2(\mathbb{R})$, defined as $D(\Delta)$, as in (5.324), but now seen as a Hilbert space in
the inner product $\langle \psi, \varphi \rangle_{(2)} = \langle \psi, \varphi \rangle + \langle \psi'', \varphi'' \rangle$, with corresponding norm given by
$\|\psi\|_{(2)}^2 = \|\psi\|^2 + \|\psi''\|^2$. This is proved just as in the case of a single derivative.


We also say that $\mathcal{S}(\mathbb{R})$ is a core for the operators in question. For example, the
canonical commutation relations $[q,p] = i\hbar \cdot 1_H$ rigorously hold on this domain.
5.12 Stone's Theorem


We now come to a central result on symmetries in quantum mechanics "explaining"
the Hamiltonian. Recall that a continuous unitary representation of $\mathbb{R}$ (as an additive
group) on a Hilbert space $H$ is a map $t \mapsto u_t$, where $t \in \mathbb{R}$ and each $u_t \in B(H)$ is
unitary, such that the associated map $\mathbb{R} \times H \to H$, $(t,\psi) \mapsto u_t\psi$, is continuous, and


\[ u_s u_t = u_{s+t}, \quad s,t \in \mathbb{R}; \tag{5.336} \]
\[ u_0 = 1_H; \tag{5.337} \]
\[ \lim_{t \to 0} u_t\psi = \psi \quad (t \in \mathbb{R},\ \psi \in H). \tag{5.338} \]

These conditions imply

\[ \lim_{t \to s} u_t\psi = u_s\psi \quad (s,t \in \mathbb{R},\ \psi \in H). \tag{5.339} \]

Note that according to Proposition 5.36 continuity may be replaced by weak measurability.
Probably the simplest nontrivial example is given by $H = L^2(\mathbb{R})$ and


\[ u_t\psi(x) = \psi(x-t). \tag{5.340} \]


To prove (5.338), we use a routine $\varepsilon/3$ argument. We first prove (5.338) for
$\psi \in C_c(\mathbb{R})$, where it is elementary in the sup-norm, i.e., $\lim_{t \to 0} \|u_t\psi - \psi\|_\infty = 0$
by continuity and hence (given compact support) uniform continuity of $\psi$. But then
the (ugly) estimate $\|\psi\|_2^2 \leq |K| \|\psi\|_\infty^2$, where $K \subset \mathbb{R}$ is any compact set containing
the support of $\psi$, also yields $\lim_{t \to 0} \|u_t\psi - \psi\|_2 = 0$. Hence for $\varepsilon > 0$ we may find
$\delta > 0$ such that $\|u_t\psi - \psi\|_2 < \varepsilon/3$ whenever $|t| < \delta$. For general $\psi' \in H$, we find
$\psi \in C_c(\mathbb{R})$ such that $\|\psi - \psi'\| < \varepsilon/3$, and, using unitarity of $u_t$, estimate


\[ \|u_t\psi' - \psi'\| \leq \|u_t\psi' - u_t\psi\| + \|u_t\psi - \psi\| + \|\psi - \psi'\| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \]


In the context of quantum mechanics, physicists formally write

\[ u_t = e^{-ita}, \tag{5.341} \]


where $a$ is usually thought of as the Hamiltonian of the system, although in the
previous example it is rather the momentum operator. In any case, we avoid the
notation $h$ instead of $a$ here, partly in order to rightly suggest far greater generality
of the construction and partly to avoid confusion with the notation in §B.21; if $h$ is
the Hamiltonian, one would have $a = h/\hbar$ in (5.341). Mathematically speaking, if $a$
is self-adjoint, eq. (5.341) is rigorously defined by Theorem B.158, where


\[ e_t(x) = \exp(-itx). \tag{5.342} \]
Conversely, given a continuous unitary representation $t \mapsto u_t$ of $\mathbb{R}$ on $H$, one may
attempt to define an operator $a$ by specifying its domain and action by


\[ D(a) = \left\{ \psi \in H \;\Big|\; \lim_{s \to 0} \frac{u_s - 1}{s}\psi \text{ exists} \right\}; \tag{5.343} \]
\[ a\psi = i \lim_{s \to 0} \frac{u_s - 1}{s}\psi \quad (\psi \in D(a)). \tag{5.344} \]


Stone's Theorem makes this rigorous, and even turns the passage from the generator
$a$ to the unitary group $t \mapsto u_t$ (and back) into a bijective correspondence.
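
For the translation group $u_t\psi(x) = \psi(x-t)$ the difference quotient in the definition of the generator can be evaluated directly. The sketch below is a pointwise illustration only (the limit in the definition of $a$ is a limit in the norm of $H$, which is not tested here); it assumes the Gaussian $\psi(x) = \exp(-x^2)$ and uses illustrative function names.

```python
import math


def psi(x):
    return math.exp(-x * x)


def translate(func, t):
    """(u_t func)(x) = func(x - t)."""
    return lambda x: func(x - t)


def generator_quotient(func, s):
    """x -> i * ((u_s func)(x) - func(x)) / s, whose limit s -> 0 defines the generator."""
    shifted = translate(func, s)
    return lambda x: 1j * (shifted(x) - func(x)) / s


def momentum_of_psi(x):
    """(p psi)(x) = -i psi'(x) for psi(x) = exp(-x^2), computed by hand."""
    return 2j * x * math.exp(-x * x)


approx = generator_quotient(psi, 1e-6)
sample_points = [-2.0, -0.5, 0.0, 0.7, 1.5]
worst = max(abs(approx(x) - momentum_of_psi(x)) for x in sample_points)
print("largest pointwise deviation:", worst)
```

At the sampled points the quotient agrees with $-i\psi'$ up to an error of the order of the step $s$, consistent with the remark that the generator of translations is the momentum operator.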


Theorem 5.73. 1. If $a : D(a) \to H$ is self-adjoint, the map $t \mapsto u_t$ defined by (5.341),
which is rigorously defined by Proposition B.159 with (5.342), defines a continuous
unitary representation of $\mathbb{R}$ on $H$.

2. Conversely, given such a representation, the operator $a$ defined by (5.343) -
(5.344) is self-adjoint; in particular, $D(a)$ is dense in $H$.

3. These constructions are mutually inverse.


Proof. We use the setting of §B.21, so that $b$ is the bounded transform of $a$.


1. Eqs. (5.336) - (5.337) are immediate from Theorem B.158, which also yields
unitarity of each operator $u_t$. To prove (5.338) we first take $\varphi \in C_c^*(b)H$, which
means that $\varphi$ is a finite linear combination of vectors of the type $\varphi = h(a)\psi$,
where $h \in C_c(\sigma(a))$ and $\psi \in H$. Using (5.342) and (B.573), we have


\[ \|u_t\varphi - \varphi\| \leq \|e_t h - h\|_\infty \|\psi\| \leq \|h\|_\infty \|e_t - 1_K\|_\infty^{(K)} \|\psi\|, \tag{5.345} \]


where $K$ is the (compact) support of $h$ in $\sigma(b)$. Since the exponential function
is uniformly convergent on any compact set, this gives $\lim_{t \to 0} \|u_t\varphi - \varphi\| = 0$.
Taking finite linear combinations of such vectors $\varphi$ gives the same result for any
$\varphi \in C_c^*(b)H$ (with an extra step this could have been done on $C_0^*(b)H$, too).
Thus for $\varepsilon > 0$ we can find $\delta > 0$ so that $\|u_t\varphi - \varphi\| < \varepsilon/3$ whenever $|t| < \delta$. For
general $\psi' \in H$, we find $\varphi \in C_c^*(b)H$ such that $\|\varphi - \psi'\| < \varepsilon/3$, and estimate


\[ \|u_t\psi' - \psi'\| \leq \|u_t\psi' - u_t\varphi\| + \|u_t\varphi - \varphi\| + \|\varphi - \psi'\| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon, \]


since $\|u_t\psi' - u_t\varphi\| = \|\psi' - \varphi\|$ by unitarity of $u_t$. This is equivalent to (5.338).
2. For any $\psi \in H$ and $n \in \mathbb{N}$, define $\psi_n \in H$ by


\[ \psi_n = n \int_0^\infty ds\, e^{-ns} u_s\psi, \tag{5.346} \]


either as a Riemann-type integral (whose approximants converge in norm) or as
a functional $\varphi \mapsto n \int_0^\infty ds\, e^{-ns} \langle u_s\psi, \varphi \rangle$, which is obviously continuous and hence
is represented by a unique vector $\psi_n \in H$. Then simple computations show that

\[ \lim_{s \to 0} \frac{u_s - 1}{s}\psi_n = n(\psi_n - \psi), \]

so that $\psi_n \in D(a)$. The proof that $\psi_n \to \psi$ starts with the elementary estimate

\[ \|\psi_n - \psi\| \leq n \int_0^\infty ds\, e^{-ns} \|u_s\psi - \psi\|, \]

in which we split up the $\int_0^\infty$ as $\int_0^\delta \cdots + \int_\delta^\infty \cdots$, where $\delta > 0$. Using strong continuity
of the map $t \mapsto u_t$, i.e., (5.338), for any $n$ the first integral vanishes as
$\delta \to 0$. In the second integral we estimate $\|u_s\psi - \psi\| \leq 2\|\psi\|$ and take the limit
$n \to \infty$. Thus $\psi_n \to \psi$, so that $D(a)$ is dense in $H$.
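
The "simple computations" behind the claim that the vectors defined in (5.346) lie in $D(a)$ can be spelled out as follows. This is a sketch for $t > 0$ (the case $t < 0$ is analogous), written with $\psi_n = n\int_0^\infty ds\, e^{-ns} u_s\psi$ and using only the group law (5.336) and a shift of the integration variable:

```latex
\begin{align*}
(u_t - 1)\psi_n
  &= n\int_0^\infty ds\, e^{-ns}\,(u_{s+t} - u_s)\psi \\
  &= n e^{nt}\int_t^\infty ds\, e^{-ns}\, u_s\psi \;-\; n\int_0^\infty ds\, e^{-ns}\, u_s\psi \\
  &= (e^{nt} - 1)\,\psi_n \;-\; n e^{nt}\int_0^t ds\, e^{-ns}\, u_s\psi .
\end{align*}
```

Dividing by $t$ and letting $t \to 0$, the first term tends to $n\psi_n$, whereas strong continuity (5.338) gives $t^{-1}\int_0^t ds\, e^{-ns} u_s\psi \to \psi$, so that the difference quotient converges to $n(\psi_n - \psi)$.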

To prove self-adjointness of $a$, we need a tiny variation on Theorem B.93:


Lemma 5.74. Let $a$ be symmetric. Then $a$ is self-adjoint (i.e. $a^* = a$) iff

\[ \mathrm{ran}(a+i) = \mathrm{ran}(a-i) = H. \tag{5.347} \]


Proof. We only need the implication from (5.347) to $a^* = a$ (but the converse
immediately follows from Theorem B.93). So assume (5.347). For given $\psi \in D(a^*)$
there must then be a $\varphi \in D(a)$ such that $(a^* - i)\psi = (a - i)\varphi$. Since $a$ is symmetric,
we have $D(a) \subset D(a^*)$, so $\psi - \varphi \in D(a^*)$, and $(a^* - i)(\psi - \varphi) = 0$. But
$\ker(a^* - i) = \mathrm{ran}(a+i)^\perp$, so $\ker(a^* - i) = 0$. Hence $\psi = \varphi$, and in particular
$\psi \in D(a)$ and hence $D(a^*) \subset D(a)$. Since we already know the opposite inclusion,
we have $D(a^*) = D(a)$. Given symmetry, this implies $a^* = a$.


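Continuing the proof of Theorem 5.73.2, symmetry of $a$ easily follows from its
definition, combined with the property $u_s^* = u_s^{-1} = u_{-s}$. Indeed, for $\psi, \varphi \in D(a)$,
the weak limit $s \to 0$ below exists by definition of $D(a)$, cf. (5.343), whence:


\[ \langle \varphi, a\psi \rangle = i \lim_{s \to 0} \left\langle \varphi, \frac{u_s - 1}{s}\psi \right\rangle = i \lim_{s \to 0} \left\langle \frac{u_{-s} - 1}{s}\varphi, \psi \right\rangle = \lim_{s \to 0} \left\langle i\,\frac{u_{-s} - 1}{-s}\varphi, \psi \right\rangle = \langle a\varphi, \psi \rangle. \]

To prove that $\mathrm{ran}(a - i) = H$, we compute $(a - i)\psi_1 = -i\psi$, with $\psi_1$ defined by
(5.346) with $n = 1$. The property $\mathrm{ran}(a + i) = H$ is proved in a similar way: now
define $\tilde{\psi}_1 = \int_{-\infty}^0 ds\, e^{s} u_s\psi$ and obtain $(a + i)\tilde{\psi}_1 = i\psi$. Thus Lemma 5.74 applies.
3. Bijectivity has two directions: $a \mapsto u_t \mapsto a$ and $u_t \mapsto a \mapsto u_t$.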


• Given $a$ and hence (5.341) defining $u_t$, we change notation from $a$ to $a'$ in
(5.343) - (5.344) and need to show that $a' = a$. Denoting the restriction of
$a$ to the domain $C_c^*(b)H$ by $a_0$, we first show that $a_0 \subset a'$. The technique to
prove this is similar to the argument around (5.345). We initially assume that
$\varphi \in D(a_0) = C_c^*(b)H$ takes the form $\varphi = h(a)\psi$ for some $h \in C_c(\sigma(a))$ and
$\psi \in H$. Just a trifle more complicated than (5.345), using (5.342), (B.573),
and unitarity of $u_t$, we estimate:


\[ \left\| \frac{u_{t+s}\varphi - u_t\varphi}{s} + i a u_t\varphi \right\| \leq \|\psi\| \left\| \frac{e_s h - h}{s} + i \cdot \mathrm{id}_{\sigma(a)} h \right\|_\infty \leq \|h\|_\infty \|\psi\| \left\| \frac{e_s - 1_K}{s} + i \cdot \mathrm{id}_K \right\|_\infty^{(K)}, \]

so that by definition of the (strong) derivative we obtain


\[ i\,\frac{d u_t\varphi}{dt} = a u_t\varphi, \tag{5.348} \]


initially for any $\varphi$ of the said form $h(a)\psi$, and hence, taking finite sums, for
any $\varphi \in D(a_0)$. The existence of this limit shows that, on the assumption
$\psi \in D(a_0)$, we have $\psi \in D(a')$, and we also see that $a' = a$ on $D(a_0)$, or, in
other words, that $a_0 \subset a'$. Since $a'$ is self-adjoint (by part 2 of the theorem) and
hence closed, we have $a_0^- \subset a'$. Since $a_0$ is essentially self-adjoint by Theorem
B.159, this gives $a \subset a'$. Taking adjoints reverses the inclusion, and since both
operators are self-adjoint this gives $a = a'$.

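• Given $u_t$ and hence (5.343) - (5.344) defining $a$, we change notation from $u_t$
to $u_t'$ in (5.341) and need to show that $u_t' = u_t$. Indeed, let


\[ \psi_t = u_t\psi, \tag{5.349} \]


and similarly $\psi_t' = u_t'\psi$. If $\psi \in D(a)$, then by definition of $a$ we have


\[ i\,\frac{d\psi_t}{dt} = i \lim_{s \to 0} \frac{u_{t+s} - u_t}{s}\psi = i \lim_{s \to 0} \frac{u_s - 1}{s}\psi_t = a\psi_t, \tag{5.350} \]


which also shows that $\psi_t \in D(a)$. Similarly, $i\,d\psi_t'/dt = a\psi_t'$, so that $\psi_t$ and $\psi_t'$
satisfy the same differential equation with the same initial condition


\[ \psi_0 = \psi_0' = \psi. \]


Now consider $\tilde{\psi}_t = \psi_t - \psi_t'$, which once again satisfies the same equation (i.e.,
$i\,d\tilde{\psi}_t/dt = a\tilde{\psi}_t$), but this time with initial condition $\tilde{\psi}_0 = \psi_0 - \psi_0' =
\psi - \psi = 0$. The key point is that any solution $\tilde{\psi}_t$ of this equation has the
property $\|\tilde{\psi}_t\| = \|\tilde{\psi}_0\|$ for any $t \in \mathbb{R}$, since by symmetry of $a$,


\[ \frac{d}{dt}\|\tilde{\psi}_t\|^2 = \frac{d}{dt}\langle \tilde{\psi}_t, \tilde{\psi}_t \rangle = i\langle a\tilde{\psi}_t, \tilde{\psi}_t \rangle - i\langle \tilde{\psi}_t, a\tilde{\psi}_t \rangle = 0. \]


For our specific $\tilde{\psi}_t$ we have $\|\tilde{\psi}_0\| = 0$ and hence $\psi_t = \psi_t'$, that is, $u_t' = u_t$.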


Corollary 5.75. With $t \mapsto u_t$ and $a$ defined and related as in Theorem 5.73, if $\psi \in D(a)$,
for each $t \in \mathbb{R}$ the vector $\psi_t$ defined by (5.349) lies in $D(a)$ and satisfies


\[ a\psi_t = i\,\frac{d\psi_t}{dt}, \tag{5.351} \]


whence $t \mapsto \psi_t$ is the unique solution of (5.351) with initial value $\psi_0 = \psi$.


This follows from the proof of part 3 of Theorem 5.73. With $a = h/\hbar$ (as above),
this is just the famous time-dependent Schrödinger equation


\[ h\psi_t = i\hbar\,\frac{d\psi_t}{dt}. \tag{5.352} \]
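
In finite dimension all domain questions disappear and the content of Stone's Theorem and its corollary can be checked by hand. The sketch below assumes the illustrative choice $a = \sigma_1$ on $H = \mathbb{C}^2$, for which $a^2 = 1$ and hence $e^{-ita} = \cos t \cdot 1 - i \sin t \cdot a$; it tests the group law, conservation of the norm, and the equation $i\,d\psi_t/dt = a\psi_t$ with a central difference. All names are hypothetical.

```python
import math

A = [[0.0, 1.0], [1.0, 0.0]]  # Hermitian with A*A = 1 (illustrative choice)


def u(t):
    """exp(-i t A) = cos(t) 1 - i sin(t) A, valid because A squares to the identity."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]


def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(2)) for i in range(2)]


def norm(vector):
    return math.sqrt(sum(abs(x) ** 2 for x in vector))


def schrodinger_residual(psi, t, step=1e-6):
    """Largest entry of i d(psi_t)/dt - A psi_t, using a central difference in t."""
    plus, minus = apply(u(t + step), psi), apply(u(t - step), psi)
    derivative = [(p - m) / (2 * step) for p, m in zip(plus, minus)]
    rhs = apply(A, apply(u(t), psi))
    return max(abs(1j * d - r) for d, r in zip(derivative, rhs))


PSI = [1.0, 0.0]
group_law_gap = max(
    abs(x - y) for x, y in zip(apply(u(0.3), apply(u(0.5), PSI)), apply(u(0.8), PSI))
)
print("group law gap:", group_law_gap)
print("norm at t = 1.7:", norm(apply(u(1.7), PSI)))
print("Schrodinger residual at t = 1.7:", schrodinger_residual(PSI, 1.7))
```

The same three checks fail to make sense naively for unbounded $a$, which is where the domain $D(a)$ in the corollary becomes essential.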
Notes


§5.1. Six basic mathematical structures of quantum mechanics 

Wigner’s Theorem was first stated by von Neumann and Wigner (1928), but the 
first proof appeared in Wigner (1931). See Bonolis (2004) and Scholz (2006) for 
some history. Instead of working with $\mathcal{P}_1(H)$ with the bilinear trace form expressing
the transition probabilities, one may also formulate and prove Wigner's Theorem
in terms of the projective Hilbert space PH equipped with the Fubini—Study metric, 
in which case the relevant symmetries may be defined geometrically as isometries. 
See Freed (2012) for this proof, as well as Brody & Hughston (2001) for the un- 
derlying geometry. Kadison’s Theorem may be traced back from Kadison (1965). 
See also Moretti (2013). Ludwig symmetries go back to Ludwig (1983); see also 
Kraus (1983). Our approach to von Neumann symmetries was inspired by Hamhal- 
ter (2004), and has a large pedigree in quantum logic. Bohr symmetries were intro- 
duced in Landsman & Lindenhovius (2016), where Theorem 5.4.6 was also proved. 


§5.2. The case H = C” 

This material is partly based on Simon (1976). The covering map (5.46) has a
nice geometric description: if $\Sigma = \mathbb{C} \cup \{\infty\}$ is the Riemann sphere, we have the
well-known stereographic projection


\[ S^2 \to \Sigma; \tag{5.353} \]
\[ (x,y,z) \mapsto \frac{x + iy}{1 - z}. \tag{5.354} \]


If $u \in SU(2)$ is given by (5.43), then the associated Möbius transformation


\[ z \mapsto \frac{\alpha z + \beta}{-\overline{\beta} z + \overline{\alpha}} \]

is a bijection of $\Sigma$, whose associated transformation of $S^2$ is the rotation $R = \tilde{\pi}(u)$.


§5.3. Equivalence between the six symmetry theorems 
Most proofs may be also found in Cassinelli et al (2004) or Moretti (2013). 


§5.4. Proof of Jordan’s Theorem 

Our proof of Jordan’s Theorem is taken from Bratteli & Robinson (1987); see 
also Thomsen (1982) for a simplification of the purely algebraic step (which we 
delegated to Theorem C.175), originally proved by Jacobson & Rickart (1950). 


§5.5. Proof of Wigner’s Theorem 

There are many proofs of Wigner’s Theorem, none of them really satisfactory 
(in this respect the situation is similar to Gleason’s Theorem). Our proof follows 
Simon (1976), who in turn relies on Bargmann (1964) and Hunziker (1972). The 
proof in Cassinelli et al (2004) seems cleaner, but their proof of the additivity of 
their operator Tj is not easy to follow. For a geometric approach see Freed (2012). 




If $\dim(H) \geq 3$, the conclusion of Wigner's Theorem follows if $W$ merely preserves
orthogonality (Uhlhorn, 1963). See also Cassinelli et al (2004). This, in turn,
has been generalized in various directions, e.g. to indefinite inner product spaces
(Molnár, 2002) as well as to certain Banach spaces, where one says that $x$ is orthogonal
to $y$ if for all $\lambda \in \mathbb{C}$ one has $\|x + \lambda y\| \geq \|x\|$ (Blanco & Turnšek, 2006).


§5.6. Some abstract representation theory 

Among numerous books on representation theory, our personal favourite is Barut 
& Raczka (1977), and also Gaal (1973) and Kirillov (1976) are classics at least for
the abstract theory. An interesting recent paper on the unitary group on infinite- 
dimensional Hilbert space is Schottenloher (2013). 


§5.7. Representations of Lie groups and Lie algebras 

This section was inspired by Hall (2013) and Knapp (1988). For Lie's Third Theorem,
see, for example, Duistermaat & Kolk (2000), §1.14. To obtain Theorem 5.41,
consider the canonical projection $\tilde{\pi} : \tilde{G} \to G$ and define $D = \tilde{\pi}^{-1}(\{e\})$. This is a discrete
normal subgroup of $\tilde{G}$, and it is an easy fact that a discrete normal subgroup of
any connected topological group must lie in its center. Note that a discrete subgroup
of the center of $\tilde{G}$ is automatically normal.

The exponentiation problem for skew-adjoint representations of $\mathfrak{g}$ is considerably
more complicated than in finite dimension. Let $H$ be an infinite-dimensional
Hilbert space with dense subspace $D$ and let $\rho : \mathfrak{g} \to L(D,H)$ be a linear map, where
$L(D,H)$ is the space of linear maps from $D$ to $H$. We say that $\rho$ is a skew-adjoint
representation of $\mathfrak{g}$ if (i): $D$ is invariant under $\rho(\mathfrak{g})$, (ii): the commutation relations
(5.157) hold on $D$, and (iii): each $i\rho(A)$ is essentially self-adjoint on $D$. For example,
we have seen that if $u : G \to U(H)$ is a unitary representation, then the construction
$\rho(A) = u'(A)$, defined on the Gårding domain $D = D_G$, fits the bill. Conversely, additional
conditions are needed for $\rho$ to exponentiate to a unitary representation. The
best-known of those is Nelson's criterion: if, given a skew-adjoint representation
$\rho : \mathfrak{g} \to L(D,H)$, the Nelson operator or Laplacian $\Delta = \sum_{k=1}^{\dim(\mathfrak{g})} \rho(T_k)^2$ is essentially
self-adjoint on $D$, then $\rho$ exponentiates to a unitary representation of $\tilde{G}$ (with
additional remarks similar to those in Corollary 5.43).


§5.8. Irreducible representations of SU(2)
§5.9. Irreducible representations of compact Lie groups 

See e.g. Knapp (1988), Simon (1996), and Deitmar (2005), and innumerable 
other books. This material ultimately goes back to (E.) Cartan and Weyl. 


§5.10. Symmetry groups and projective representations 

See Varadarajan (1985), Tuynman & Wiegerinck (1987), Landsman (1998a), 
Cassinelli et al (2004), and Hall (2013). For different proofs of Theorem 5.59 
(Bargmann, 1954) see Simms (1971) and Cassinelli et al (2004). Leaving out the 
anti-unitary symmetries is a pity; see e.g. Freed & Moore and Roberts (2016). 
§5.11. Position, momentum, and free Hamiltonian 
§5.12. Stone’s Theorem 

See Reed & Simon (1972), Schmüdgen (2012), Moretti (2013), Hall (2013), and
many other books. Our proof of part 1 of Theorem 5.73 is original. 


Part II
Between $C_0(X)$ and $B(H)$


Chapter 6 
Classical models of quantum mechanics 


This chapter gives an introduction to a chain of results attempting to exclude deeper 
layers underneath quantum mechanics that restore some form of classical physics: 


‘[Such results] more or less illustrate the ways along which some opponents might hope to 
escape Bohr’s reasonings and von Neumann’s proof and the places where they are danger- 
ously near breaking their necks.’ (Groenewold, 1946, p. 454) 


In so far as they are mathematically precise, such no-go results have their roots 
in von Neumann’s 1932 book, which gave rise to two traditions that were often 
in polemical opposition to each other. Mathematically minded authors typically 
admired von Neumann’s exclusion of hidden variables, yet tried to strengthen his 
theorem by weakening its assumptions; this sparked, for example, Gleason’s Theorem
(1957) as well as the Kochen–Specker Theorem (1967). Certain physicists (led
by Bell), on the other hand, tried to circumvent (and later even ridicule) von
Neumann’s work. A high point of this tradition was Bell’s Theorem from 1964, which
was informed not only by von Neumann, but even more so by the famous
Einstein–Podolsky–Rosen (EPR) paper from 1935, as well as by Bohm’s deterministic pilot
wave reformulation of quantum mechanics (1952). However, at the end of the day
these traditions turned out to be not really divergent after all: Bell not only
independently (and earlier) obtained a version of the Kochen–Specker Theorem, but, more
importantly, his results from 1964 turn out to be very closely related to the culmina- 
tion of the first tradition in the form of the so-called Free Will Theorem (FWT), which 
was published by Conway and Kochen during 2006-2008. Indeed, although its va- 
lidity is uncontroversial, this theorem has been criticized on the following grounds: 


1. Lack of novelty compared with the famous paper by Bell (1964), whose assump- 
tions and conclusions are at least quite similar to those of the FWT (although the 
underlying proofs are mathematically quite distinct from those in the FWT). 

2. Lack of novelty even within its own terms: versions of the FWT had actually been 
around for decades under less illustrious titles and authorships, e.g. Heywood & 
Redhead (1983), Stairs (1983), Brown & Svetlichny (1990), and Clifton (1993). 

3. Circularity, in that indeterminism is presupposed (namely in the assumption that
‘experimenters have a certain freedom’) instead of derived.


© The Author(s) 2017 191 
K. Landsman, Foundations of Quantum Theory, 
Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3_6 




One aim of this chapter is to clarify these matters, with the following conclusions: 


1. The difference between earlier literature in the same direction and the FWT is 
largely one of emphasis, namely on free will (!), exemplifying a recent trend 
(also found elsewhere) in emphasizing free choice of the settings of experiments. 
Unfortunately, like Bell, Conway and Kochen even mathematically use an infor- 
mal way of talking about free settings, not to speak of the complete absence of 
any serious philosophical analysis of free will among all three authors (for which 
perhaps Bell, but certainly not Conway and Kochen may be excused). 

2. Granting the informal characterization of free settings, both Bell’s (1964) The- 
orem and the FWT establish a contradiction between quantum mechanics, deter- 
minism, and locality (in the sense of Bell, which in the presence of determinism 
reduces to a no-signaling condition called parameter independence). 

3. The technical difference between Bell’s Theorem and the FWT lies in four facts: 


a. Bell’s arguments rely on probability theory (whereas the FWT does not). 

b. The (optical) corner of quantum mechanics used in Bell’s Theorem may be 
replaced by the corresponding experimental results, whereas the FWT uses 
uncontroversial yet untested predictions about massive spin-1 particles. 

c. The FWT must assume perfect (EPR) correlations, which are difficult to realize 
and hence are avoided by later versions of Bell’s Theorem (i.e. through the 
CHSH inequalities rather than the original Bell inequalities). 

d. Like EPR, Bell and his followers focused on locality right from the begin- 
ning, and hence in Bell (1964) the inference is from locality to determinism. 
Conway and Kochen, on the other hand, resolve the contradiction their FWT 
established by inferring randomness of outcomes from freedom of settings. 


We start with a very simple treatment of both von Neumann’s argument against 
linear hidden variables and Kochen & Specker’s refinement of it, in which von Neu- 
mann’s controversial linearity assumption is decisively weakened so as to only apply 
to commuting operators; the Kochen–Specker Theorem excludes what are called
non-contextual quasi-linear hidden variables. We then present what we see as a
more transparent version of the FWT, whose key ingredient of replacing the
non-contextuality assumption in the Kochen–Specker Theorem by a locality condition
is preserved, but where this time the setting is completely deterministic. Freedom
of choice then arises as a very natural independence assumption, and any threat of
circularity is avoided: the conclusion is simply a contradiction between
determinism, freedom of choice (i.e. of apparatus settings), locality, and quantum mechanics.
Moreover, as we argue in §6.3, the philosophically precise concept of free will used
in the assumptions of the FWT is what Lewis coined ‘local miracle compatibilism’.

Following an interlude on the GHZ Theorem, which seamlessly fits into the given 
framework, we then turn to Bell’s Theorems, which we compare with the FWT. 

Finally, we give our own rigorous version of an argument first proposed by Col- 
beck and Renner to the effect that, under suitable freeness of choice and no-signaling 
conditions (similar to those in Bell’s Theorem and the FWT), as long as they are 
compatible with quantum mechanics, hidden variables are at best irrelevant. In fact, 
this can only be proved under much stronger assumptions, obscuring the claim. 
6.1 From von Neumann to Kochen–Specker


Von Neumann’s Theorem 6.2 below was the first technical result excluding some
class of hidden variables underneath quantum mechanics, namely (in current
parlance) linear non-contextual hidden variables. This terminology requires some
explanation. First, theorems of this kind apparently accept the mathematical structure
of the observables prescribed by the usual formalism of quantum theory, i.e.,
observables are identified with elements of the self-adjoint part

\[ H_n(\mathbb{C}) = M_n(\mathbb{C})_{\mathrm{sa}} = \{a \in M_n(\mathbb{C}) \mid a^* = a\} \tag{6.1} \]

of the algebra $M_n(\mathbb{C})$ of $n \times n$ matrices (this simple case suffices to make all points
of conceptual interest). Short of introducing “hidden” observables, hidden variable
theories propose the existence of hidden states, which either replace or supplement
the usual quantum states (which in the case at hand would be density operators).
Mimicking classical (statistical) physics, such states are interpreted as probability
measures on some phase space $X$, whose points $x \in X$ assign sharp values to
quantum-mechanical observables. Naively, this is done through associated functions

\[ V_x : H_n(\mathbb{C}) \to \mathbb{R}, \tag{6.2} \]

but in fact this choice already commits us to the first of two possibilities, which we
pragmatically present as theories predicting measurement outcomes:


• In non-contextual deterministic theories of measurement, the outcome solely
depends on the observable $a$ that is being measured and on the (possibly ‘hidden’)
state of the system. Theorem 6.2 below, then, rules out such theories in which
values are sharp (i.e., dispersion-free), and $V_x$ in (6.2) is linear. The
Kochen–Specker Theorem subsequently proves the same impossibility under a weaker
(and physically more reasonable) assumption called quasi-linearity.

• Contextual deterministic theories of measurement, on the other hand, allow the
outcome of some measurement of $a$ to depend on the measurement context (as
well as on the state), which in this case is understood as the choice of possible
other (compatible) observables $b$ measured together with $a$ (i.e., $ab = ba$). This
seems a reasonable assumption, well within the spirit of quantum mechanics,
though perhaps not so in the extreme form later held by Heisenberg, according
to which measurement outcomes (or even “reality”) are “created” by the
measurement. Under a weakened non-contextuality assumption, Bell’s Theorem (cf.
§6.5) and the Free Will Theorem (§6.2) rule out such theories, too.


Definition 6.1. A non-contextual hidden variable is a map $V : H_n(\mathbb{C}) \to \mathbb{R}$ that
for each $a \in H_n(\mathbb{C})$, and in terms of the $n \times n$ unit matrix $1_n$, satisfies

\[ V(a^2) = V(a)^2; \tag{6.3} \]
\[ V(1_n) = 1. \tag{6.4} \]

That is, $V$ is dispersion-free as well as normalized, respectively.
Theorem 6.2. For $n \geq 2$, non-zero linear dispersion-free maps $V : H_n(\mathbb{C}) \to \mathbb{R}$ do
not exist. In particular, linear non-contextual hidden variables do not exist.


Proof. Such maps extend to complex-linear dispersion-free maps $V : M_n(\mathbb{C}) \to \mathbb{C}$
by complex linearity, so that the theorem is equivalent to Proposition 2.10.


As von Neumann perfectly well understood himself, his seemingly natural linear- 
ity assumption (given the mathematical structure of quantum mechanics unearthed 
by none other than he!) is unwarranted physically (and even mathematically, since 
eigenvalues and eigenstates, which should be the hallmark of dispersion-free states, 
are by no means linear in the underlying operator). This suggests the following: 


Definition 6.3. A map $V : H_n(\mathbb{C}) \to \mathbb{R}$ is called quasi-linear if for all $s,t \in \mathbb{R}$ and
all $a,b \in H_n(\mathbb{C})$ that commute (i.e., $ab = ba$) one has

\[ V(sa + tb) = sV(a) + tV(b). \tag{6.5} \]

As in the linear case, such a map uniquely extends to a map $V : M_n(\mathbb{C}) \to \mathbb{C}$ that is
precisely a quasi-state in the sense of Definition 2.26. The following lemma will be
useful, also showing that the above objections to linearity have been met.


Lemma 6.4. Let $V : H_n(\mathbb{C}) \to \mathbb{R}$ be a quasi-linear non-contextual hidden variable.

1. For each $a \in H_n(\mathbb{C})$, the number $\lambda = V(a)$ is an eigenvalue of $a$.
2. If $(a_1,\ldots,a_k)$ pairwise commute, and $b = f(a_1,\ldots,a_k)$ for some polynomial $f$,
then $V(b) = f(V(a_1),\ldots,V(a_k))$.


More generally, it follows from Theorem C.24 that if $H$ is a Hilbert space and
$V : B(H)_{\mathrm{sa}} \to \mathbb{R}$ is a quasi-linear non-contextual hidden variable (or, equivalently, its
complexification $V_{\mathbb{C}} : B(H) \to \mathbb{C}$ is a dispersion-free quasi-state), then $V(a) \in \sigma(a)$
(provided $a^* = a$). This implies the above lemma, but we also provide a direct proof.


Proof. For any $b \in H_n(\mathbb{C})$ with $ab = ba$, eq. (6.3) and quasi-linearity imply that

\[ V(ab) = V(a)V(b): \tag{6.6} \]

just evaluate $V((a+b)^2) = (V(a) + V(b))^2$. Taking $b = a^2$ etc. and also invoking
(6.4) then yields $V(p(a)) = p(V(a))$ for any polynomial $p$. If $\lambda_i$ are the eigenvalues
of $a$, its characteristic polynomial $p(x) = \prod_{i=1}^n (x - \lambda_i)$ satisfies $p(a) = 0$, so that
$V(p(a)) = 0$ and hence $p(V(a)) = 0$, or $\prod_{i=1}^n (\lambda - \lambda_i) = 0$. This implies that $\lambda = \lambda_i$
for some $i$. The second claim is proved in a similar way.
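
For the reader's convenience, the evaluation behind (6.6) may be spelled out as follows. This is merely an expanded form of the hint in the proof, assuming only (6.3) and quasi-linearity (6.5); note that $a^2$, $ab$, and $b^2$ are self-adjoint and mutually commute because $ab = ba$, so that (6.5) may be applied twice.

```latex
\begin{align*}
V((a+b)^2) &= V(a^2 + 2ab + b^2) = V(a^2) + 2V(ab) + V(b^2)
            = V(a)^2 + 2V(ab) + V(b)^2; \\
V((a+b)^2) &= V(a+b)^2 = (V(a) + V(b))^2
            = V(a)^2 + 2V(a)V(b) + V(b)^2.
\end{align*}
```

Comparing the two right-hand sides gives $V(ab) = V(a)V(b)$.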


Theorem 6.5. For $n \geq 3$, quasi-linear non-contextual hidden variables do not exist.


This is the Kochen–Specker Theorem. It follows from Gleason’s Theorem 2.28 and
von Neumann’s Theorem 6.2, since according to Corollary 2.29 to the former,
quasi-states on $M_n(\mathbb{C})$ are actually states (in other words, quasi-linear non-contextual
hidden variables are linear). However, Kochen and Specker also gave a direct proof of
their theorem, subsequently somewhat simplified along the following lines.
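Proof. We prove the claim for $n = 3$, which (by restricting $V$ to any self-adjoint
subalgebra of $M_n(\mathbb{C})$ isomorphic to $H_3(\mathbb{C})$) implies the result for all $n > 3$ also. To
prove Theorem 6.5 for $n = 3$, we interpret $H_3(\mathbb{C})$ as the algebra of observables of a
spin-1 particle and introduce the well-known angular momentum matrices

\[ J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
   J_2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad
   J_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{6.7} \]

In what follows, we will heavily use the squares

\[ J_1^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
   J_2^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
   J_3^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{6.8} \]

each of which has eigenvalues 0 and 1. The $J_i^2$ commute by inspection, and satisfy

\[ J_1^2 + J_2^2 + J_3^2 = 2 \cdot 1_3. \tag{6.9} \]

The (matrix-valued) angular momentum vector is given by

\[ J = J_1 e_1 + J_2 e_2 + J_3 e_3, \tag{6.10} \]

where $(e_1,e_2,e_3)$ is the standard basis of $\mathbb{R}^3$ (seen as a vector space with the usual
inner product $\langle\cdot,\cdot\rangle$), i.e., $e_1 = (1,0,0)$, etc., and the angular momentum $J_u$ along an
arbitrary unit vector $u = \sum_i u_i e_i$ in $\mathbb{R}^3$ is given by

\[ J_u = \langle J, u \rangle = \sum_{i=1}^{3} u_i J_i. \tag{6.11} \]

This brings us to the crucial point: a map $V : H_3(\mathbb{C}) \to \mathbb{R}$ induces a map $\tilde{V} : S^2 \to \mathbb{R}$
on the set $S^2$ of all unit vectors $u$ in $\mathbb{R}^3$, via

\[ \tilde{V}(u) = V(J_u^2). \tag{6.12} \]

As usual, a basis of $\mathbb{R}^3$, denoted by $a = (u_1, u_2, u_3)$, is always assumed orthonormal.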


Lemma 6.6. Let $V : H_3(\mathbb{C}) \to \mathbb{R}$ be a non-contextual quasi-linear hidden variable,
with associated map $\tilde{V} : S^2 \to \{0,1\}$ given by (6.12). Then:

1. $\tilde{V}(-u) = \tilde{V}(u)$ for each $u \in S^2$ (so that $\tilde{V}$ is defined on the real projective plane);
2. If $a = (u_1, u_2, u_3)$ is a basis, then the triple $\tilde{V}(a) = (\tilde{V}(u_1), \tilde{V}(u_2), \tilde{V}(u_3))$ must
contain a single 0 and two 1’s, i.e., $\tilde{V}(a)$ must be one of the triples

\[ \lambda^{(1)} = (0,1,1); \quad \lambda^{(2)} = (1,0,1); \quad \lambda^{(3)} = (1,1,0). \tag{6.13} \]

In Gleason-like language, $\tilde{V}$ is a 2-valued frame function of weight $w(\tilde{V}) = 2$.


Proof. If $a = (u_1,u_2,u_3)$ is a basis, then $J_{u_i} = u J_i u^*$ for $i = 1,2,3$ (up to an
overall sign if the basis is left-handed, which drops out of the squares), where $u$ is the
$3 \times 3$ matrix with entries $u_{ij} = \langle u_j, e_i \rangle$. Since $u$ is unitary, the matrices $J_{u_i}$ and their
squares have the same eigenvalues and satisfy the same relations as the $J_i$ and their
squares. Thus the eigenvalues of $J_{u_i}^2$ are 0 and 1, for fixed $a$ the squares $J_{u_i}^2$ mutually
commute, and they satisfy the sum rule (6.9), i.e., $J_{u_1}^2 + J_{u_2}^2 + J_{u_3}^2 = 2 \cdot 1_3$, so
$\tilde{V}(u_1) + \tilde{V}(u_2) + \tilde{V}(u_3) = 2$. The claim then follows from Definition 6.3 and Lemma 6.4.
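
The algebraic facts used in this proof are easy to confirm numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `J_along`, `squares_for_basis`, and `check_basis` are ours and purely illustrative. It checks, for a given orthonormal basis, that each $J_{u_i}^2$ equals $1_3 - |u_i\rangle\langle u_i|$ (and hence has spectrum $\{0,1,1\}$), that the three squares commute, and that they obey the sum rule (6.9).

```python
import numpy as np

# Spin-1 matrices in the real basis, as in (6.7).
J = [
    np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]),
    np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]]),
    np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]),
]
I3 = np.eye(3)


def J_along(u):
    """Angular momentum J_u = sum_i u_i J_i along a real vector u, cf. (6.11)."""
    return sum(ui * Ji for ui, Ji in zip(u, J))


def squares_for_basis(basis):
    """Return the matrices J_{u_i}^2 for a basis (u_1, u_2, u_3)."""
    return [J_along(u) @ J_along(u) for u in basis]


def check_basis(basis, tol=1e-12):
    """Check spectrum {0,1,1}, mutual commutativity, and the sum rule (6.9)."""
    sq = squares_for_basis(basis)
    for u, s in zip(basis, sq):
        # J_u^2 = 1 - |u><u| for a real unit vector u, so u spans the kernel.
        assert np.allclose(s, I3 - np.outer(u, u), atol=tol)
        assert np.allclose(np.sort(np.linalg.eigvalsh(s)), [0, 1, 1], atol=1e-9)
    for s in sq:
        for t in sq:
            assert np.allclose(s @ t, t @ s, atol=tol)
    assert np.allclose(sum(sq), 2 * I3, atol=tol)
    return True


standard = [row for row in np.eye(3)]
# An arbitrary orthonormal basis from a QR decomposition (illustrative choice).
q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))
rotated = [q[:, i] for i in range(3)]
print(check_basis(standard), check_basis(rotated))
```

Since only the squares are tested, the check is insensitive to the orientation of the basis.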


Now define a coloring of $\mathbb{R}^3$ as any map $\tilde{V} : S^2 \to \{0,1\}$ satisfying the two properties
in Lemma 6.6. The proof of Theorem 6.5 then reduces to the following lemma.


Lemma 6.7. There exists no coloring of $\mathbb{R}^3$.


Proof. Take the following unit vectors (some identical), grouped into 11 bases (for
simplicity we use unnormalized vectors, e.g., $(1,0,1)$ stands for $(1/\sqrt{2},0,1/\sqrt{2})$):


basis    $u_1$        $u_2$        $u_3$

$a_1$     (0,0,1)     (1,0,0)     (0,1,0)
$a_2$     (1,0,1)     (-1,0,1)    (0,1,0)
$a_3$     (0,1,1)     (0,-1,1)    (1,0,0)
$a_4$     (1,-1,2)    (-1,1,1)    (1,1,0)
$a_5$     (1,0,2)     (-2,0,1)    (0,1,0)
$a_6$     (2,1,1)     (0,-1,1)    (-1,1,1)
$a_7$     (2,0,1)     (0,1,0)     (-1,0,2)
$a_8$     (1,1,2)     (1,-1,0)    (-1,-1,1)
$a_9$     (0,1,2)     (1,0,0)     (0,-2,1)
$a_{10}$    (1,2,1)     (-1,0,1)    (1,-1,1)
$a_{11}$    (1,0,0)     (0,2,1)     (0,-1,2)


We will show that one cannot even color this particular finite set of vectors (let alone
all unit vectors in $\mathbb{R}^3$). We denote a vector $u_i$ in a basis $a_k$ by

\[ u_i^{(k)}, \quad i = 1,2,3, \quad k = 1,\ldots,11, \]

and write e.g. $\tilde{V}(a_k) = (0,1,1)$ for the three conditions

\[ \tilde{V}(u_1^{(k)}) = 0, \quad \tilde{V}(u_2^{(k)}) = 1, \quad \tilde{V}(u_3^{(k)}) = 1. \]

The main point is that if some coloring $\tilde{V}$ maps a specific vector $u$ to 0, then all
vectors orthogonal to $u$ must go to 1. In particular, two orthogonal vectors can never
both be sent to 0. To find a contradiction (to the assumption that $\tilde{V}$ exists), we try
to assign values $\tilde{V}(u_i^{(k)})$ one after the other, starting in row 1. Here some specific
choices will be made, but by symmetry other choices lead to similar contradictions.
1. Suppose that $\tilde{V}(a_1) = (0,1,1)$ (i.e., $\tilde{V}(u_1^{(1)}) = 0$ and $\tilde{V}(u_2^{(1)}) = \tilde{V}(u_3^{(1)}) = 1$). In
$a_2$ this forces $\tilde{V}(u_3^{(2)}) = 1$, so that either $u_1^{(2)}$ or $u_2^{(2)}$ must be mapped to 0 (and
the other to 1). Let $\tilde{V}(u_1^{(2)}) = 0$, so that $\tilde{V}(u_2^{(2)}) = 1$, i.e., $\tilde{V}(a_2) = (0,1,1)$. In $a_3$
one has $u_3^{(3)} = u_2^{(1)}$, so $\tilde{V}(u_3^{(3)}) = 1$. We choose $\tilde{V}(u_1^{(3)}) = 0$ and hence $\tilde{V}(u_2^{(3)}) = 1$,
so $\tilde{V}(a_3) = (0,1,1)$. In $a_4$, the vector $u_2^{(4)}$ is orthogonal to $u_1^{(2)}$, which has
been mapped to zero already, so that $\tilde{V}(u_2^{(4)}) = 1$. The remaining free choice is
arbitrarily made as $\tilde{V}(u_1^{(4)}) = 0$, so that $\tilde{V}(u_3^{(4)}) = 1$ and hence $\tilde{V}(a_4) = (0,1,1)$.

2. But now everything is fixed for $a_5$ through $a_{11}$, as follows. From $a_5$, the vector $u_3^{(5)}$
already occurred in $a_1$, and moreover, $u_2^{(5)}$ is orthogonal to $u_1^{(4)}$ from $a_4$.
Because $\tilde{V}(u_1^{(4)}) = 0$, one must have $\tilde{V}(u_2^{(5)}) = 1$. And so on and so forth, yielding
$\tilde{V}(a_k) = (0,1,1)$ for $k = 5,\ldots,10$ (as was the case also for $k = 1,2,3,4$).

3. In $a_{11}$, one has $u_1^{(11)} = u_2^{(1)}$, so $u_1^{(11)}$ is mapped to 1. Furthermore, $u_2^{(11)}$ is
orthogonal to $u_1^{(4)}$, which was mapped to 0; hence $u_2^{(11)}$ goes to 1. Finally, $u_3^{(11)}$ is
orthogonal to $u_1^{(10)}$, which was mapped to 0, so that $u_3^{(11)}$ must go to 1. Thus

\[ \tilde{V}(a_{11}) = (1,1,1). \tag{6.14} \]

But $(1,1,1)$ is not an admissible value of $\tilde{V}$! So $\tilde{V}$ and hence $V$ cannot exist.
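
The bookkeeping in steps 1 to 3 can be mechanized. The sketch below is an illustration rather than part of the proof: it takes the eleven bases as read off from the table (with entries fixed by mutual orthogonality), assumes the four choices made in step 1, and then applies the two coloring rules until some vector would need both colors or some basis would be colored $(1,1,1)$. The names `BASES`, `propagate`, etc. are ours. Note that it only follows the particular choices of step 1; it does not explore the other choices that the proof dismisses by symmetry.

```python
from itertools import combinations

# The eleven bases (unnormalized integer vectors), numbered as a_1, ..., a_11.
BASES = {
    1: [(0, 0, 1), (1, 0, 0), (0, 1, 0)],
    2: [(1, 0, 1), (-1, 0, 1), (0, 1, 0)],
    3: [(0, 1, 1), (0, -1, 1), (1, 0, 0)],
    4: [(1, -1, 2), (-1, 1, 1), (1, 1, 0)],
    5: [(1, 0, 2), (-2, 0, 1), (0, 1, 0)],
    6: [(2, 1, 1), (0, -1, 1), (-1, 1, 1)],
    7: [(2, 0, 1), (0, 1, 0), (-1, 0, 2)],
    8: [(1, 1, 2), (1, -1, 0), (-1, -1, 1)],
    9: [(0, 1, 2), (1, 0, 0), (0, -2, 1)],
    10: [(1, 2, 1), (-1, 0, 1), (1, -1, 1)],
    11: [(1, 0, 0), (0, 2, 1), (0, -1, 2)],
}


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def is_orthogonal_basis(basis):
    return all(dot(u, v) == 0 for u, v in combinations(basis, 2))


def propagate(zeros):
    """Return True if the vectors initially colored 0 force a contradiction."""
    vectors = {v for basis in BASES.values() for v in basis}
    color = {v: 0 for v in zeros}
    changed = True
    while changed:
        changed = False
        forced = {}
        # Rule 1: a vector orthogonal to a 0-colored vector must be colored 1.
        for z, c in color.items():
            if c == 0:
                for v in vectors:
                    if dot(z, v) == 0:
                        forced[v] = 1
        # Rule 2: each basis contains exactly one 0.
        for basis in BASES.values():
            ones = [v for v in basis if color.get(v) == 1]
            if len(ones) == 3:
                return True  # the triple (1,1,1) is not admissible
            if len(ones) == 2:
                (rest,) = [v for v in basis if v not in ones]
                if forced.get(rest) == 1:
                    return True  # the third vector would need both colors
                forced[rest] = 0
        for v, c in forced.items():
            if v in color and color[v] != c:
                return True  # conflicting requirements on v
            if v not in color:
                color[v] = c
                changed = True
    return False


# The zeros chosen in step 1: u_1^(1), u_1^(2), u_1^(3), u_1^(4).
step1_zeros = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, -1, 2)]
print(all(is_orthogonal_basis(b) for b in BASES.values()), propagate(step1_zeros))
```

The propagation may detect the clash at a slightly different point than the text does (for instance as a vector of $a_{11}$ requiring both colors), but the obstruction is the same.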


Corollary 6.8. There is no function $\tilde{V}$ with the two properties stated in Lemma 6.6.


The Kochen—Specker Theorem is often stated in the following way. 


Definition 6.9. For any finite-dimensional Hilbert space $H$, a coloring of the set
$\mathcal{P}_1(H)$ of one-dimensional projections on $H$ is a function

\[ W : \mathcal{P}_1(H) \to \{0,1\} \]

such that for any resolution of the identity $(e_i)$ with $e_i \in \mathcal{P}_1(H)$, i.e.,

\[ e_i e_j = \delta_{ij} e_i; \tag{6.15} \]
\[ \sum_i e_i = 1_H; \tag{6.16} \]

one has

\[ \sum_i W(e_i) = 1, \tag{6.17} \]

so that there is exactly one member $e_i$ of the family such that $W(e_i) = 1$.


Note that if $e \in \mathcal{P}_1(H)$ then $e = e_\psi = |\psi\rangle\langle\psi|$ for some unit vector $\psi \in H$, so
that each basis $(v_i)$ of $H$ defines such a family by $e_i = |v_i\rangle\langle v_i|$, and vice versa,
up to phase factors. The setting of Gleason’s Theorem is similar, with the crucial
difference that the function on $\mathcal{P}_1(H)$ in question then takes values in $[0,1]$ instead
of $\{0,1\}$ and hence can be shown to exist, even amply so (as there are many states).
Theorem 6.10. If $\dim(H) > 2$, there exists no coloring of $\mathcal{P}_1(H)$.

Proof. For $H = \mathbb{C}^3$, the existence of $W$ would yield the existence of $\tilde{V}$ through

\[ \tilde{V}(u) = 1 - W(e_u), \tag{6.18} \]

where $u \in \mathbb{R}^3$ is regarded as a vector in $\mathbb{C}^3$. Property 1 in Lemma 6.6 is obviously
satisfied. To prove property 2, we note that for any unit vector $u \in \mathbb{R}^3 \subset \mathbb{C}^3$, we have

\[ J_u^2 u = 0, \tag{6.19} \]

since an explicit computation based on (6.11) shows that, with $u = (u_1,u_2,u_3)$,

\[ J_u^2 = \begin{pmatrix}
u_2^2 + u_3^2 & -u_1 u_2 & -u_1 u_3 \\
-u_1 u_2 & u_1^2 + u_3^2 & -u_2 u_3 \\
-u_1 u_3 & -u_2 u_3 & u_1^2 + u_2^2
\end{pmatrix}. \tag{6.20} \]

It follows from rotation invariance that the eigenvalues of $J_u^2$ are the same as those
of each $J_i^2$, cf. (6.8), i.e., $\lambda = 0$ with multiplicity one and $\lambda = 1$ with multiplicity
two. Hence (6.19) gives the projection $e_0$ onto the eigenspace of $J_u^2$ for $\lambda = 0$ as

\[ e_0 = |u\rangle\langle u| = e_u. \tag{6.21} \]

Property 2 in Lemma 6.6 then follows from the assumption that $W$ is a coloring.
Since $\tilde{V}$ cannot exist by Lemma 6.7, neither can $W$. This proves the claim for $\mathbb{C}^3$.
We finish by induction. Suppose $\mathbb{C}^n$ contains some set $\{u_k\}_{k \in K}$ of unit vectors
that cannot be colored, assuming that $u_0 = (1,0,\ldots,0)$ lies in this set. We embed
each $u_k$ into $\mathbb{C}^{n+1}$ by adding a zero at the end, calling the image $u'_k$. Adding
$v = (0,\ldots,0,1)$, the only possible coloring of the set $\{u'_k, v\}_{k \in K}$ in $\mathbb{C}^{n+1}$ is given by
$W(u'_k) = 0$ for each $k \in K$ and $W(v) = 1$. Indeed, if $W(u'_{k_0}) = 1$ for some $k_0$, then,
since $v$ is orthogonal to each $u'_k$, we must have $W(v) = 0$, which means that the
original set $\{u_k\}_{k \in K}$ should be colorable in $\mathbb{C}^n$, but this is impossible by assumption.
We now embed each $u_k$ into $\mathbb{C}^{n+1}$ by adding a zero at the beginning, denoting its
image by $u''_k$, and add $u'_0 = (1,0,\ldots,0,0)$. By the same token, the only coloring of
the set $\{u''_k, u'_0\}_{k \in K}$ is given by $W(u''_k) = 0$ for each $k \in K$ and $W(u'_0) = 1$. But this
leaves the set $\{u'_k, u''_k, v\}_{k \in K}$ in $\mathbb{C}^{n+1}$ uncolorable, since colorability of $\{u'_k, v\}_{k \in K}$
gave $W(u'_0) = 0$, whereas colorability of $\{u''_k, u'_0\}_{k \in K}$ gave $W(u'_0) = 1$.


The set thus obtained is larger than necessary. For example, already for $H = \mathbb{C}^4$ the
following bases cannot be colored (again writing down unnormalized vectors):

basis    $u_1$           $u_2$           $u_3$           $u_4$

$a_1$     (0,0,0,1)      (0,0,1,0)      (1,1,0,0)      (1,-1,0,0)
$a_2$     (0,0,0,1)      (0,1,0,0)      (1,0,1,0)      (1,0,-1,0)
$a_3$     (1,-1,1,-1)    (1,-1,-1,1)    (1,1,0,0)      (0,0,1,1)
$a_4$     (1,-1,1,-1)    (1,1,1,1)      (1,0,-1,0)     (0,1,0,-1)
$a_5$     (0,0,1,0)      (0,1,0,0)      (1,0,0,1)      (1,0,0,-1)
$a_6$     (1,-1,-1,1)    (1,1,1,1)      (1,0,0,-1)     (0,1,-1,0)
$a_7$     (1,1,-1,1)     (1,1,1,-1)     (1,-1,0,0)     (0,0,1,1)
$a_8$     (1,1,-1,1)     (-1,1,1,1)     (1,0,1,0)      (0,1,0,-1)
$a_9$     (1,1,1,-1)     (-1,1,1,1)     (1,0,0,1)      (0,1,-1,0)


The proof is the following observation: if we present the coloring condition as

\[ W(0,0,0,1) + W(0,0,1,0) + W(1,1,0,0) + W(1,-1,0,0) = 1; \tag{$a_1$} \]
\[ \vdots \]
\[ W(1,1,1,-1) + W(-1,1,1,1) + W(1,0,0,1) + W(0,1,-1,0) = 1, \tag{$a_9$} \]

then since there are nine such equations the sum of the right-hand sides is odd,
whereas the sum of the left-hand sides is even, since each vector appears twice.
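
Because this set is so small, the parity argument can be cross-checked by exhaustive search. The following sketch (standard library only; the names `BASES_4D`, `occurrences`, and `colorings` are ours) confirms that the nine rows are orthogonal bases, that each of the 18 vectors occurs in exactly two of them, and that none of the $2^{18}$ assignments of values in $\{0,1\}$ satisfies all nine coloring conditions.

```python
from collections import Counter
from itertools import combinations, product

# The nine bases of C^4 from the table above (unnormalized vectors).
BASES_4D = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]


def is_orthogonal_basis(basis):
    return all(
        sum(x * y for x, y in zip(u, v)) == 0 for u, v in combinations(basis, 2)
    )


def occurrences():
    """Count in how many bases each vector appears."""
    return Counter(v for basis in BASES_4D for v in basis)


def colorings():
    """Yield every map W from the vectors to {0,1} with exactly one 1 per basis."""
    vectors = sorted(occurrences())
    for bits in product((0, 1), repeat=len(vectors)):
        W = dict(zip(vectors, bits))
        if all(sum(W[v] for v in basis) == 1 for basis in BASES_4D):
            yield W


print(len(occurrences()), set(occurrences().values()), next(colorings(), None))
```

The search is of course redundant given the parity argument; it merely guards against transcription errors in the table.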

To bridge the gap between the Kochen–Specker Theorem and the Free Will
Theorem, as well as the one between mathematics and physics, we now rephrase the
former as a “mini FWT”. We build an experiment consisting of a box containing a
spin-1 particle and a device capable of measuring all of the three observables

\[ (J_{u_1}^2, J_{u_2}^2, J_{u_3}^2) \]

for an arbitrary basis $a$ of $\mathbb{R}^3$; since the operators in question commute, this
simultaneous measurement is allowed by quantum theory. The choice of $a$ is called
the setting of the experiment, traditionally denoted by $A$ (in honor of Alice, who is
supposed to perform the experiment), with possible values $A = a$. In
“phenomenological” notation, the observable measured in an experiment like this is called $F$,
which in the case at hand has three components $F = (F_1, F_2, F_3)$: given the setting $a$,
the observable $F_i$ corresponds to $J_{u_i}^2$. The notation $F = \lambda$ for $\lambda = (\lambda_1,\lambda_2,\lambda_3)$, i.e.
$F_i = \lambda_i$, then expresses the fact that the outcome of a measurement of $F$ is $\lambda$.

According to both quantum mechanics and our quasi-linear non-contextual hidden
variable theory, either $\lambda_i = 0$ or $\lambda_i = 1$, and $\lambda$ must lie in the value space

\[ \Lambda = \{(0,1,1), (1,0,1), (1,1,0)\}; \tag{6.22} \]

cf. Lemma 6.6 for the hidden variable theory, while in quantum mechanics (6.22)
follows from the fact that $\lambda$ must lie in the joint spectrum of the three operators $J_{u_i}^2$.
This, in turn means that there must be a joint eigenvector $\psi$ such that $J_{u_i}^2 \psi = \lambda_i \psi$
for each $i = 1,2,3$. There are three such joint eigenvectors, namely $u_1$, $u_2$, and
$u_3$ (initially defined as vectors in $\mathbb{R}^3$ but now seen as vectors in $\mathbb{C}^3$), with joint
eigenvalues $(0,1,1)$, $(1,0,1)$, and $(1,1,0)$, respectively.

Otherwise, quantum mechanics and our quasi-linear non-contextual hidden
variable theory provide a different picture of the experiment. According to the former
theory, a given spin-1 particle may be prepared in a (pure) quantum state $\psi$, which
is a unit vector in $\mathbb{C}^3$. Quantum theory then merely predicts probabilities

\[ P_\psi(F = \lambda \mid A = a) = P_\psi^{(J_{u_1}^2, J_{u_2}^2, J_{u_3}^2)}(\lambda_1, \lambda_2, \lambda_3), \tag{6.23} \]

for the possible outcomes $\lambda$, which according to the Born rule (2.21) are given by

\[ P_\psi(F = \lambda^{(i)} \mid A = a) = |\langle u_i, \psi \rangle|^2. \tag{6.24} \]

So if $\psi = u_i$, then the outcome will be $\lambda = \lambda^{(i)}$ with probability one, but in a
superposition $\psi = \sum_i c_i u_i$ (with $\sum_i |c_i|^2 = 1$), quantum theory predicts a random sequence
of outcomes $\lambda^{(i)}$, each with probability $|c_i|^2$.

Let us note that quantum mechanics is non-contextual in the following
(probabilistic) sense. Alice could decide to perform just one measurement instead of three,
say $F_1$, with setting $A_1 = u_1$, or perhaps she may not know if the other two are
performed. Fortunately, this does not matter, since for any unit vector $\psi \in \mathbb{C}^3$,

\[ P_\psi(F_1 = \lambda_1 \mid A_1 = u_1) = \sum_{\lambda_2, \lambda_3} P_\psi(F = \lambda \mid A = a), \tag{6.25} \]

so that according to quantum mechanics, it does not matter for the Born probabilities
of the first measurement if the other two are performed or not.
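
The identity (6.25) can be illustrated numerically. The sketch below assumes NumPy; the function names are ours, and the basis and state are arbitrary illustrative choices. The joint probabilities are computed from (6.24), and the marginal for $F_1$ is computed once by summing over $(\lambda_2,\lambda_3)$ and once directly from the spectral projections of $J_{u_1}^2$ alone, using that its eigenvalue-0 projection is $|u_1\rangle\langle u_1|$.

```python
import numpy as np

LAMBDAS = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]  # the value space (6.22)


def joint_probabilities(basis, psi):
    """Born probabilities (6.24): lambda^(i) occurs with probability |<u_i, psi>|^2."""
    return {lam: abs(np.vdot(u, psi)) ** 2 for lam, u in zip(LAMBDAS, basis)}


def marginal_from_joint(basis, psi, lam1):
    """Right-hand side of (6.25): sum the joint probabilities over lambda_2, lambda_3."""
    joint = joint_probabilities(basis, psi)
    return sum(p for lam, p in joint.items() if lam[0] == lam1)


def marginal_direct(u1, psi, lam1):
    """Left-hand side of (6.25), using only the spectral projections of J_{u_1}^2."""
    e0 = np.outer(u1, np.conj(u1))  # projection onto the eigenvalue 0
    proj = e0 if lam1 == 0 else np.eye(3) - e0
    return float(np.real(np.vdot(psi, proj @ psi)))


rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthonormal basis of R^3
basis = [q[:, i] for i in range(3)]
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)  # a random unit vector in C^3

for lam1 in (0, 1):
    print(lam1, marginal_direct(basis[0], psi, lam1), marginal_from_joint(basis, psi, lam1))
```

Both columns of the printout should agree up to rounding, whichever basis completes $u_1$.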

The question now arises if some quasi-linear non-contextual hidden variable
theory could improve on this, in that the probabilities quantum theory assigns
to various outcomes are replaced by predictions. In the spirit of determinism (whilst
avoiding the appearance of circularity), such a theory should also predict the settings
of the experiment. Accordingly, the assumptions leading to our “mini FWT” are:


Definition 6.11. In the context of the experiment on spin-1 particles just discussed:


• Determinism firstly means that there is a state space $X$ with associated functions

\[ A : X \to X_A; \tag{6.26} \]
\[ F : X \to \Lambda, \tag{6.27} \]

where $X_A$ is the set of all bases in $\mathbb{R}^3$ (i.e. $a \in X_A$), and $\Lambda$ is some set of possible
outcomes; these functions completely describe the experiment in the sense that
each state $x \in X$ determines both its settings $a = A(x)$ and its outcome $\lambda = F(x)$.
Here $A = (A_1,A_2,A_3)$, where the functions $A_i : X \to S^2$ (seen as the space of unit
vectors in $\mathbb{R}^3$) combine to define a basis, and $F = (F_1,F_2,F_3)$, where $F_i : X \to \mathbb{R}$.
Secondly, there exists some set $X_Z$ and an additional function

\[ Z : X \to X_Z, \tag{6.28} \]

such that

\[ F = \hat{F}(A,Z). \tag{6.29} \]

More precisely, for each $x \in X$ one has

\[ F(x) = \hat{F}(A(x), Z(x)) \tag{6.30} \]

for a certain function $\hat{F} : X_A \times X_Z \to \Lambda$. Also this function is, of course, a triple
$\hat{F} = (\hat{F}_1, \hat{F}_2, \hat{F}_3)$, where $\hat{F}_i : X_A \times X_Z \to \{0,1\}$. In terms of (6.28), then:

• Nature then requires that $\Lambda$ is given by (6.22) (so that $F_i : X \to \{0,1\}$).

• Freedom states that $A$ and $Z$ are independent in the sense that the function

\[ A \times Z : X \to X_A \times X_Z, \quad x \mapsto (A(x), Z(x)) \tag{6.31} \]

is surjective; in other words, for each $(a,z) \in X_A \times X_Z$ there is an $x \in X$ for which
$A(x) = a$ and $Z(x) = z$ (making $a$ and $z$ free variables).

• Non-contextuality (cf. Lemma 6.6) finally stipulates that $\hat{F}$ take the form

\[ \hat{F}((u_1,u_2,u_3), z) = (\tilde{F}(u_1,z), \tilde{F}(u_2,z), \tilde{F}(u_3,z)), \tag{6.32} \]

for a single function $\tilde{F} : S^2 \times X_Z \to \{0,1\}$ that also satisfies

\[ \tilde{F}(-u,z) = \tilde{F}(u,z). \tag{6.33} \]

“Nature” may be taken to be either an experimental result or an uncontroversial
prediction of (some corner of) quantum mechanics. The function $Z$ (including its
domain $X_Z$) describes anything relevant to the experiment (such as the behaviour of
the particle) except the variables determining the settings (which do form part of
$X$). The goal of the freedom assumption is to remove any potential dependencies
between the variables $(a,z)$, and hence between the physical system Alice performs
her measurements on, and the devices she performs her measurements with.


Corollary 6.12. Determinism, Nature, Freedom, and Non-contextuality are
contradictory.


Proof. For each $z \in X_Z$, define a function $\tilde{V}_z : S^2 \to \{0,1\}$ by $\tilde{V}_z(u) = \tilde{F}(u,z)$. The
assumptions combine to give $\tilde{V}_z$ the same properties as $\tilde{V}$ in Lemma 6.6 (where $z$
“goes along for a free ride”). According to Corollary 6.8 (which applies because by
Freedom one can freely vary $a$ for any given $z$), the function $\tilde{V}_z$ cannot exist.


This “mini FWT” is a good exercise for the Free Will Theorem in the next section.
For example, let us note, as a warning, that if Determinism is seen as the culprit (and
hence falls), then the other assumptions in the (mini) FWT are no longer defined. This
blocks a direct inference from Freedom to Indeterminism à la Conway & Kochen.
6.2 The Free Will Theorem


The Free Will Theorem is similar in spirit to Corollary 6.12, with the difference that
the experiment now has two wings and the non-contextuality assumption is replaced
by a certain locality condition. This condition relates to the setting introduced by
Einstein, Podolsky, and Rosen in 1935 and further studied by Bohm, Bell, and
others, in which (in current jargon) two physicists, called Alice and Bob, are far apart
whilst performing simultaneous experiments on some correlated two-particle state
(technically speaking, their measurements need to be spacelike separated). In the
situation considered by EPR each particle had a spatial degree of freedom and hence
required the infinite-dimensional Hilbert space $L^2(\mathbb{R}^3)$ for its description, but, as
recognized by Bohm, the thrust of the argument comes out more clearly if each
particle merely has an internal degree of freedom (and is “frozen” otherwise).

Bell (1964) considered a pair of spin-$\tfrac{1}{2}$ particles (cf. §6.5), each of which has
Hilbert space $\mathbb{C}^2$ (although the famous experiments of Aspect testing the violation of
Bell’s inequalities used photons, which have the “same” Hilbert space), but because
of its reliance on the Kochen–Specker Theorem (which fails for $\mathbb{C}^2$) the Free Will
Theorem requires one dimension more, i.e., $H = \mathbb{C}^3$. As before, we see this as the
state space of a massive spin-1 particle. The price of this extra dimension is that
the pertinent experiment whose outcome provides the Nature input for the Free Will
Theorem has not actually been performed, but, as in the Bell case, the predictions
of quantum mechanics are uncontroversial and will serve as input instead.

These predictions are as follows. Alice and Bob measure on the correlated state

\[ \psi_0 = (e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3)/\sqrt{3}, \tag{6.34} \]

where we recall that $(e_1,e_2,e_3)$ is the standard basis of $\mathbb{R}^3$, now seen as a basis of
$\mathbb{C}^3$. This state is rotation-invariant, which means that nonzero angular momentum in
one particle must be compensated for in the other, creating the desired correlations.

As before, we denote Alice’s setting by $A = a$, which remains the choice of some
basis of $\mathbb{R}^3$, but this time also Bob picks some basis $b$, so that we write $B = b$ for his
choice. Similar to Alice’s outcome $F = \lambda$ we denote Bob’s by $G = \gamma$, and quantum
mechanics provides all (Born) probabilities

\[ P_{\psi_0}(F = \lambda, G = \gamma \mid A = a, B = b) = P_{\psi_0}^{(J_{u_1}^2, J_{u_2}^2, J_{u_3}^2, J_{v_1}^2, J_{v_2}^2, J_{v_3}^2)}(\lambda_1,\lambda_2,\lambda_3,\gamma_1,\gamma_2,\gamma_3), \]

which are well defined because Alice’s squared angular momentum operators $J_{u_i}^2$
commute with Bob’s $J_{v_j}^2$ as a consequence of Einstein locality (stating that spacelike
separated observables commute). Note that similarly to $a = (u_1,u_2,u_3)$ for Alice’s
basis, we write $b = (v_1,v_2,v_3)$ for Bob’s. If Alice merely measures $F_i$ whilst Bob
measures $G_j$, then, as in the previous section, it does not matter which other
(commuting) operators are measured and/or whether Alice and Bob know about this, cf.
(6.25). Thus we may write either $(A = a, B = b)$ or $A_i = u_i, B_j = v_j$ for the settings,
and simple calculations show that the Born probabilities are given by:
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Pyp(F = 1,G; = 1|A =a,B=b) = 4(1+ (uj, v;)); (6.35) 
Py (Fi =0,G; =0|A =a,B =b) = 3 (uj,v;)"; (6.36) 
Pyy(F; = 1,G; =0|A =a,B =b) = 4(1— (uj,v;)”); (6.37) 
Py) (Fi =0,G; = 1|A =a,B =b) = 1(1—(ui,v;)”), (6.38) 


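where $\langle u_i, v_j\rangle^2 = |\langle u_i, v_j\rangle|^2$, etc., since the vectors are real. In terms of the notation

$$P_{\psi_0}(F_i = G_j \mid \cdot) = P_{\psi_0}(F_i = 0, G_j = 0 \mid \cdot) + P_{\psi_0}(F_i = 1, G_j = 1 \mid \cdot); \tag{6.39}$$
$$P_{\psi_0}(F_i \neq G_j \mid \cdot) = P_{\psi_0}(F_i = 0, G_j = 1 \mid \cdot) + P_{\psi_0}(F_i = 1, G_j = 0 \mid \cdot), \tag{6.40}$$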


this yields 


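$$P_{\psi_0}(F_i = G_j \mid A = a, B = b) = \tfrac{1}{3}(1 + 2\langle u_i, v_j\rangle^2); \tag{6.41}$$
$$P_{\psi_0}(F_i \neq G_j \mid A = a, B = b) = \tfrac{2}{3}(1 - \langle u_i, v_j\rangle^2). \tag{6.42}$$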


The crucial point for the Free Will Theorem is that this implies perfect correlation: 

$$P_{\psi_0}(F_i = G_j \mid A_i = B_j) = 1, \tag{6.43}$$


in agreement with the intuition about angular momentum expressed earlier. 
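
The probabilities (6.35) - (6.38) and the perfect correlation (6.43) can be checked numerically. The following is a minimal sketch rather than part of the original argument: it assumes that, in the real basis used here, the squared spin-1 operator is $J_u^2 = 1 - |u\rangle\langle u|$, so that outcome 0 corresponds to the projection onto $u$. The helper names `projection` and `born_table` are hypothetical.

```python
import numpy as np

def projection(u):
    """Projection |u><u| onto a real unit vector u in R^3."""
    u = np.asarray(u, dtype=float)
    return np.outer(u, u)

def born_table(u, v):
    """Joint Born probabilities P(F_i = f, G_j = g) in the state psi_0 of (6.34).

    Assumption (not restated in this section): in the real basis the squared
    spin-1 operator is J_u^2 = 1 - |u><u|, so outcome 0 corresponds to the
    projection |u><u| and outcome 1 to its complement.
    """
    e = np.eye(3)
    psi0 = sum(np.kron(e[k], e[k]) for k in range(3)) / np.sqrt(3)
    alice = {0: projection(u), 1: np.eye(3) - projection(u)}
    bob = {0: projection(v), 1: np.eye(3) - projection(v)}
    return {(f, g): float(psi0 @ np.kron(alice[f], bob[g]) @ psi0)
            for f in (0, 1) for g in (0, 1)}

# Two random unit vectors standing in for u_i and v_j.
rng = np.random.default_rng(0)
u = rng.normal(size=3)
u /= np.linalg.norm(u)
v = rng.normal(size=3)
v /= np.linalg.norm(v)

table = born_table(u, v)
overlap_sq = float(u @ v) ** 2
expected = {(1, 1): (1 + overlap_sq) / 3, (0, 0): overlap_sq / 3,
            (1, 0): (1 - overlap_sq) / 3, (0, 1): (1 - overlap_sq) / 3}
max_error = max(abs(table[key] - expected[key]) for key in table)

# Identical vectors: the mismatch probability (6.42) vanishes, cf. (6.43).
same = born_table(u, u)
mismatch_same = same[(0, 1)] + same[(1, 0)]
print(max_error, mismatch_same)
```

Both printed numbers should vanish up to rounding error, which is the numerical counterpart of (6.35) - (6.38) and (6.43).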

We now move to a (possibly counterfactual) deterministic description of this ex- 
periment along the lines of the previous section. It is straightforward to adapt all 
of Definition 6.11 except Non-contextuality (which after all is the assumption we 
would like to get rid of!). With the obvious changes, we obtain: 


• Determinism again first claims there is a state space $X$ with associated functions

$$A : X \to X_A; \tag{6.44}$$
$$B : X \to X_B; \tag{6.45}$$
$$F : X \to \Lambda; \tag{6.46}$$
$$G : X \to \Lambda, \tag{6.47}$$

where $X_A = X_B$ is the set of all bases in $\mathbb{R}^3$, and $\Lambda$ is some set of possible
outcomes, which completely describe the experiment in the sense that each
state $x \in X$ determines both its settings $(a = A(x), b = B(x))$ and its outcome
$(\lambda = F(x), \gamma = G(x))$. Here $A = (A_1, A_2, A_3)$ and $B = (B_1, B_2, B_3)$, where the functions
$A_i : X \to S^2$ (where $S^2$ is seen as the space of unit vectors in $\mathbb{R}^3$) combine
to define a basis (similarly for $B_j : X \to S^2$), and $F = (F_1, F_2, F_3)$. Secondly, there
exists some set $X_Z$ and an additional function $Z : X \to X_Z$ such that

$$F = \hat{F}(A, B, Z); \tag{6.48}$$
$$G = \hat{G}(A, B, Z), \tag{6.49}$$

in that for each $x \in X$ one has the functional relationships

$$F(x) = \hat{F}(A(x), B(x), Z(x)); \tag{6.50}$$
$$G(x) = \hat{G}(A(x), B(x), Z(x)), \tag{6.51}$$

for certain functions $\hat{F} : X_A \times X_B \times X_Z \to \Lambda$ and $\hat{G} : X_A \times X_B \times X_Z \to \Lambda$, each
of which is a triple $\hat{F} = (\hat{F}_1, \hat{F}_2, \hat{F}_3)$ with $\hat{F}_i : X_A \times X_B \times X_Z \to \mathbb{R}$, etc. The value
$z = Z(x)$ is just the traditional “hidden variable” (which is often denoted by $\lambda$).
• Freedom then states that $A$, $B$, and $Z$ are independent in that for each
$(a, b, z) \in X_A \times X_B \times X_Z$ there is an $x \in X$ for which $A(x) = a$, $B(x) = b$, and $Z(x) = z$.
• Nature requires that:

— $\Lambda$ is given by (6.22), i.e. $F_i$ and $G_j$, and hence $\hat{F}_i$ and $\hat{G}_j$, take values in $\{0, 1\}$;
— The experiment measures squares of angular momenta, so that

$$\hat{F}(a', b', z) = \hat{F}(a, b, z); \tag{6.52}$$
$$\hat{G}(a', b', z) = \hat{G}(a, b, z), \tag{6.53}$$

whenever $(a', b')$ differ from $(a, b)$ by changing the sign of any basis vector;
— Perfect correlation obtains, cf. (6.43), i.e., writing $a = (u_1, u_2, u_3)$ for Alice’s
basis and $b = (v_1, v_2, v_3)$ for Bob’s, one has

$$u_i = v_j \;\Rightarrow\; \hat{F}_i(a, b, z) = \hat{G}_j(a, b, z). \tag{6.54}$$


We now come to the locality condition that is to replace Non-contextuality. This 
condition was first clearly stated by Bell (1964, p. 196), who attributes it to Einstein: 


‘The vital assumption is that the result G for particle 2 does not depend on the setting a of
the magnet for particle 1, nor F on b.’


Noting various other notions of locality (such as Einstein locality in local quantum 
physics, which requires spacelike separated operators to commute, or Bell locality, 
discussed below), the above idea might be called Context locality, but we will simply 
refer to it as Locality. In our deterministic setting, a precise formulation is this: 


• Locality means that $\hat{F}(A, B, Z)$ is independent of $B$ and $\hat{G}(A, B, Z)$ is independent
of $A$. In other words, we have $F = \hat{F}(A, Z)$ and $G = \hat{G}(B, Z)$, so that (with slight
abuse of notation) $\hat{F} : X_A \times X_Z \to \Lambda$ and $\hat{G} : X_B \times X_Z \to \Lambda$, or, then again,
$F(x) = \hat{F}(A(x), Z(x))$ and $G(x) = \hat{G}(B(x), Z(x))$, for each $x \in X$.
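
To make the roles of Freedom and Locality concrete, here is a toy sketch on small finite sets. Everything in it is hypothetical: the two-element sets stand in for the true setting spaces (sets of bases), and the outcome functions are invented purely to show one function that satisfies Locality and one that does not.

```python
from itertools import product

# Hypothetical finite stand-ins for X_A, X_B, X_Z (the real X_A, X_B are sets of bases).
X_A = ["a1", "a2"]
X_B = ["b1", "b2"]
X_Z = [0, 1]

# Toy state space: all triples, with A, B, Z the coordinate projections.
X = list(product(X_A, X_B, X_Z))

def A(x): return x[0]
def B(x): return x[1]
def Z(x): return x[2]

def satisfies_freedom(states):
    """Freedom: every (a, b, z) in X_A x X_B x X_Z is realized by some state."""
    realized = {(A(x), B(x), Z(x)) for x in states}
    return realized == set(product(X_A, X_B, X_Z))

def ignores_argument(func, position):
    """True if func(a, b, z) does not depend on the argument at `position` (0 = a, 1 = b)."""
    domain = (X_A, X_B)[position]
    for a, b, z in product(X_A, X_B, X_Z):
        for other in domain:
            args = [a, b, z]
            args[position] = other
            if func(*args) != func(a, b, z):
                return False
    return True

def F_hat(a, b, z):
    """Hypothetical outcome function for Alice that uses only (a, z)."""
    return (X_A.index(a) + z) % 2

def G_hat_nonlocal(a, b, z):
    """Hypothetical outcome function for Bob that violates Locality by reading a."""
    return (X_A.index(a) + X_B.index(b) + z) % 2

freedom_full = satisfies_freedom(X)
# Removing all states with (a, z) = ("a1", 1) correlates the setting with z and breaks Freedom.
freedom_restricted = satisfies_freedom([x for x in X if not (A(x) == "a1" and Z(x) == 1)])
alice_local = ignores_argument(F_hat, 1)         # is F_hat independent of b?
bob_local = ignores_argument(G_hat_nonlocal, 0)  # is G_hat independent of a?
print(freedom_full, freedom_restricted, alice_local, bob_local)
```

In this toy model Freedom is just surjectivity of $x \mapsto (A(x), B(x), Z(x))$, and Locality is the statement that an outcome function ignores the distant setting.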


This finally brings us to (our reformulation of) the Free Will Theorem: 


Theorem 6.13. Determinism, Freedom, Nature, and Locality are contradictory. 


Proof. The Freedom assumption allows us to treat $(a, b, z)$ as free variables, a
fact that will tacitly be used all the time. First, taking $i = j$ in (6.54) shows that
$\hat{F}_i(u_1, u_2, u_3, z)$ only depends on $(u_i, z)$, whilst $\hat{G}_j(v_1, v_2, v_3, z)$ only depends on
$(v_j, z)$. Hence we write $\hat{F}_i(a, z) = \hat{F}_i(u_i, z)$, etc. Next, taking $i \neq j$ in (6.54) shows
that $\hat{F}_1(u, z) = \hat{F}_2(u, z) = \hat{F}_3(u, z)$. Consequently, the function $\hat{F} : X_A \times X_Z \to \Lambda$ is
given by (6.32). We are now back to the proof of Corollary 6.12, concluding that
such a function does not exist by Corollary 6.8.

6.3 Philosophical intermezzo: Free will in the Free Will Theorem


‘The determinism-free will controversy has all of the earmarks of a dead problem. The 
positions are well staked out and the opponents manning them stare at each other in mutual 
incomprehension.’ (Earman, 1986, p. 235) 


The question arises which specific notion of free will is among the assumptions of 
the FWT (in the reformulation just given). To put this question in perspective, let 
us briefly recall the main point of the debate about free will. This concept has two 
poles. One is the “will” itself, requiring a sense of agency, deliberation, and control. 
This pole seems to require some form of determinism. A powerful expression is:


‘Fürst! Was Sie sind, sind Sie durch Zufall und Geburt. Was ich bin, bin ich durch mich.’¹
(Beethoven, to his benefactor (!) Prince Lichnowsky) 


The other pole of free will is the adjective “free”, i.e., the ability to do otherwise, 
which at first sight requires indeterminism. The problem of free will is that these 
poles seem contradictory. Many authors conflate free will with moral responsibility: 


‘free will can be defined as the unique ability of persons to exercise control over their 
conduct in the manner necessary for moral responsibility.’ (McKenna & Coates, 2015) 


This aspect is irrelevant to our discussion, concerned as it is with the question what
it would mean for Alice and Bob to choose their settings “freely” if determinism is
assumed (it would have been different if one setting launched a nuclear missile).
Even in our narrow context, the traditional philosophical stances are relevant:


• Compatibilism denies the contradiction, claiming that free will and determinism
coexist. This position may be defended in many ways, among which one finds: 


— Reconceptualizing “the ability to do otherwise” in a deterministic world. This 
will be our focus in what follows, especially in a version inspired by Lewis. 
— Belittling the relevance of “the ability to do otherwise”, as e.g. by Dennett: 


‘So if anyone at all is interested in the question of whether one could have done 
otherwise in exactly the same circumstances (and internal state) this will have to be 
a particularly pure metaphysical curiosity—that is to say, a curiosity so pure as to be 
utterly lacking in any ulterior motive, since the answer could not conceivably make 
any noticeable difference to the way the world went.’ (Dennett, 1984, p. 559). 


• Incompatibilism accepts the contradiction, once again branching off into:


— Libertarianism, arguing that free will requires an indeterministic world. 
— Hard determinism, claiming determinism (which is assumed) blocks free will: 


‘Ein Mensch kann zwar tun was er will, aber nicht wollen was er will.’²
(Schopenhauer) 
— Hard incompatibilism, asserting that ‘every way you look at it you lose’: 
free will makes no sense in either a deterministic or an indeterministic world. 


¹ ‘Lord! What you are, you are through chance and birth. What I am, I am because of myself.’
² ‘One can admittedly do what one wants, but one cannot want what one wants.’

Although hard incompatibilism has our sympathy, our opening question concerning
the notion of free will in the FWT drives us into the compatibilist direction,
since determinism is among the assumptions shown to be contradictory by
Theorem 6.13. Within compatibilism, we will be close to the well-known ‘local miracle’
variant thereof proposed by the philosopher David Lewis. Like other compatibilists
before him (starting at least with G.E. Moore), Lewis attempts to make sense of the
intuition that even in a deterministic world one in principle has the ability to act 
differently from the way one actually does, despite the fact that the latter was pre- 
determined. A simple example is Alice’s choosing setting a by moving her hand in 
a certain way, although she was able to choose a’. On the other hand, she could not 
have moved her hand with a speed greater than that of light, so her ability remains 
constrained by the laws of nature. Lewis asks us to distinguish between: 


• ‘I am able to do something such that, if I did it, a law would be broken.’
• ‘I am able to break a law.’


The latter is impossible, but the former is not on Lewis’s own theory of counterfac- 
tuals, according to which the phrase ‘if I did it’ leads us to consider the possible 
world in which doing ‘something’ is actually true, whilst in the possible worlds 
under consideration as many other features as possible are kept the same as in the 
actual world (the precise underlying measure of similarity is not important here). 
Thus the phrase ‘a law would be broken’ refers to the laws of the actual world (in 
which the alternative action is not realized). It seems to be of great importance to 
Lewis that in the first case it is not the agent who would break a law; instead, it is the 
breaking of some law of our actual world at an earlier time that enables the subject 
to do in an alternative possible world what she could not do in our actual world.

By making this distinction, Lewis claims that he invalidates the seemingly lethal 
Consequence Argument against compatibilist free will, of which a simple version 
reads (assuming determinism, on which compatibilist free will is predicated): 


1. Alice’s actions are a necessary consequence of the laws of nature plus the state 
of the universe (or the relevant part thereof) at any earlier time; 

2. Alice is unable to render both (laws and earlier states) false; 

3. Alice is unable to render the consequences of laws and earlier states false; 

4. Ergo: Alice is unable to do otherwise than what she actually does. 


Lewis claims that statement 3 is ambiguous, in that it fails to distinguish between the 
two senses in his two bullet points above. The Consequence Argument requires the 
latter (which is false), whereas this argument itself is unsound on the former (which 
is true). This disambiguation of assumption 3 in the Consequence Argument, then, 
is supposed to save (compatibilist) free will. However, a considerable philosophical 
literature suggests that the tension between Lewis’s denying the second bullet point 
whilst accepting the first is pretty uncomfortable, reflecting the corresponding ten- 
sion between the conjunction of determinism and freedom in general; indeed, this is 
what the FWT makes precise! Let us first point out that, at least in his terminology,
Lewis fails to make a clear distinction between laws of nature and initial states;
from the point of view of modern physics, this distinction is absolutely fundamental
(although it may disappear in post-modern physics based on e.g. quantum gravity).
Lewis’s examples of law-breaking events in our actual world typically refer to
violations of some law of nature (like exceeding the speed of light), whereas the
(alleged) law-breaking in his counterfactuals, such as choosing a’ (where in fact Alice
did not do so), amounts to a change in some earlier state. Thus it might have been
more appropriate if the paper in which Lewis laid out his version of compatibilism
had been entitled Are we free to change the states? instead of Are we free to break
the laws? On this revision, his distinction of the two cases takes the following form:


• I am able to do something such that, if I did it, the state of the actual world at
some earlier time would have been different.
• I am able to change the actual state of the world.


The latter remains impossible, while it is the former that enables free will. Applied 
to Alice, the former should mean (still in the compatibilist spirit of Lewis): 


• A slight alteration in the state of the actual world (which would have made it a
different but very similar world according to Lewis) would have led Alice to do 
something (such as choosing a’) that she did not do in the actual world (because 
according to determinism its actual state at any earlier time—as opposed to the 
counterfactual alternative state in the discussion—led her to choose a). 


We now make this revised version of Lewis’s local miracle compatibilism math- 
ematically precise, in a way that has the additional advantage of involving not only 
“the ability to do otherwise”, but also the other component of free will, i.e. agency.
Here the intuition is that free will involves a separation between the agent, Alice, 
(who is to exercise it) and the rest of the world, under whose influence she acts. 
Namely, as in the FWT, let X be the state space of the Universe, and let 


$$a = A(x) \tag{6.55}$$


again be Alice’s setting, where $A : X \to X_A$, as before. We now assume that $a$ is
determined by her “inner state” $I$ as well as the “outer state” $O$ of the rest of the
world, under whose influence she acts. These, in turn, are determined by the state
$x \in X$ of the world. That is, $A = \hat{A}(O, I)$, which expresses the existence of functions

$$O : X \to X_O; \tag{6.56}$$
$$I : X \to X_I; \tag{6.57}$$
$$\hat{A} : X_O \times X_I \to X_A, \tag{6.58}$$

where $X_O$ and $X_I$ are certain sets, such that for each $x \in X$ one has

$$A(x) = \hat{A}(O(x), I(x)). \tag{6.59}$$

In other words, for some given state $x$ of the world we have

$$o = O(x); \tag{6.60}$$
$$i = I(x); \tag{6.61}$$
$$a = \hat{A}(o, i). \tag{6.62}$$

Note that, in the spirit of Conway and Kochen, in the above analysis Alice (whose
free choice they after all believe to be ultimately a consequence of the free choice
of elementary particles) now plays the role of the spin-1 particles in the bipartite
experiment. Thus the analogy is between the triples:

$$(a, z, \lambda) \in X_A \times X_Z \times \Lambda; \tag{6.63}$$
$$(o, i, a) \in X_O \times X_I \times X_A. \tag{6.64}$$

• The first triple is defined in the experimental context of the FWT, where $a$ is the
setting of Alice’s wing of the experiment (which from the perspective of the spin-1
particle plays the role of the outer state of the world), $z$ is the inner state of the
particle, and $\lambda$ is the outcome of Alice’s measurement.

• The second pertains to the analysis of Alice’s “free” choice of the setting of her
experiment, where $o$ is the outer state of the world, $i$ is her inner state, and $a$ is
her actual setting, given $x \in X$ and hence $(o, i) = (O(x), I(x))$.


Beyond Determinism, which is expressed by the above framework, our funda- 
mental assumption underpinning compatibilist free will is Freedom, defined exactly 
as in the FWT: $O$ and $I$ are independent in that the following function is surjective:

$$O \times I : X \to X_O \times X_I, \quad x \mapsto (O(x), I(x)), \tag{6.65}$$

i.e., for each pair $(o, i) \in X_O \times X_I$ there is $x \in X$ for which (6.60) and (6.61) hold.
Rephrasing our earlier analysis in this elementary mathematical language, Lewis 
wants to make sense of the idea that although Alice’s choice (6.62) at some fixed 
time $t$ was determined by the state $x$ of the Universe at that time through (6.60) -
(6.61), or, equivalently, through (6.59), and hence (and this is the whole point of the
Consequence Argument Lewis challenges) by any earlier state $x_0$ of the Universe
at time $t_0$, nonetheless Alice was “able to act otherwise” at time $t$, e.g. in choosing

$$a' = \hat{A}(o', i'), \tag{6.66}$$

but did not do so, since choosing $a'$ would illegally have changed the state $x$ to $x'$
(both at time $t$), and, equivalently (given determinism), would have changed $x_0$ to
$x_0'$. On our reading of Lewis’s theory of counterfactuals, Alice’s ability to choose $a'$
simply means that there exists a state $x'$ of the world close to $x$ in the sense that


$$O(x') = O(x) = o, \tag{6.67}$$

making the environment in which Alice acts the same as in the actual world, but

$$I(x') = i' \neq i, \tag{6.68}$$


where i’ should be close to i in some appropriate sense (such as a slight change in 
the state of Alice’s brain), such that (6.66) holds, with o’ = o as required by (6.67). 
The point, then, is that according to our Freedom assumption, there indeed is such
a nearby state $x'$, for any given $i'$ and $(o, i)$. Thus the freedom Alice has is precisely
what we have formalized as Freedom: even given the state o of the causal influences 
on her behaviour (and possibly even the entire state of the rest of the world), there is 
a different admissible state x’ of the world such that, had this state been actual, she 
would have chosen a’ (although she in fact, necessarily, picked a). 
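
The formal content of this argument is small enough to be spelled out on a toy model. The sketch below is only an illustration under invented assumptions: the finite sets of outer and inner states, the rule `A_hat`, and the helper `counterfactual_states` are all hypothetical. It shows that once $O \times I$ is surjective, a state $x'$ with the same outer state and a different inner state always exists.

```python
from itertools import product

X_O = ["o1", "o2"]            # outer states (hypothetical labels)
X_I = ["i1", "i2", "i3"]      # Alice's inner states (hypothetical labels)
X = list(product(X_O, X_I))   # toy states of the world; O and I are coordinate projections

def O(x): return x[0]
def I(x): return x[1]

def A_hat(o, i):
    """Hypothetical deterministic rule giving Alice's setting from (o, i)."""
    return "a" if i == "i1" else "a'"

def A(x):
    """Alice's setting as a function of the state of the world, cf. (6.59)."""
    return A_hat(O(x), I(x))

def counterfactual_states(x, i_prime):
    """States x' with O(x') = O(x) and I(x') = i', cf. (6.67) - (6.68)."""
    return [y for y in X if O(y) == O(x) and I(y) == i_prime]

# Freedom: the map x -> (O(x), I(x)) is surjective onto X_O x X_I, cf. (6.65).
freedom = {(O(x), I(x)) for x in X} == set(product(X_O, X_I))

actual = ("o1", "i1")
alternatives = counterfactual_states(actual, "i2")
print(freedom, A(actual), [A(y) for y in alternatives])
```

In the actual state Alice picks `a`; in the alternative state, which shares the outer state `o1`, she picks `a'`. Nothing in the toy model says which inner states count as "close", which the text also leaves open.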

It should be clear now that at least in the context of the Free Will Theorem, our 
precise technical formulation of all assumptions implies that the freedom Alice and 
Bob have in choosing their settings is an instance of the local miracle compatibilist 
form of free will proposed by Lewis (1981), at least if one accepts our reformulation 
thereof. The theorem then establishes a contradiction between: 


• the physics assumptions, i.e., Nature and Locality;
• the compatibilist free will assumption, i.e., Determinism and Freedom.


Accepting the former, the latter must fall. Making this choice, one should realize that 
the physics assumptions on the one hand just form a small corner of modern physics 
(from which point of view they are weak), but on the other hand have singled out 
the corner in which the two fundamental theories of quantum mechanics and special 
relativity meet and are brought to a head (from which perspective they are strong). 

The challenge their theorem puts to compatibilism was recognized by Conway
& Kochen (2009), who write: 


‘The tension between human free will and physical determinism has a long history. Long 
ago, Lucretius made his otherwise deterministic particles swerve unpredictably to allow 
for free will. It was largely the great success of deterministic classical physics that led to 
the adoption of determinism by so many philosophers and scientists, particularly those in 
fields remote from current physics. (This remark also applies to “compatibilism”, a now 
unnecessary attempt to allow for human free will in a deterministic world.)’ 


This quotation does not use a precise version of compatibilism, but, as Conway 
explains elsewhere, what they mean is that compatibilism in whatever form was 
a desperate pre-twentieth-century attempt to save the notion of free will for e.g. 
Christianity in the face of the physics of the time, which assumed that the universe 
was a mechanical clockwork. Such attempts, then, would no longer be necessary 
if the world is, in fact, indeterministic (as Conway and Kochen claim to have at 
last proved). Our reformulation of their theorem (which removes the threat of cir- 
cularity) gives a more subtle picture: the FWT uses modern physics to challenge 
one particular version of compatibilist free will. As such, it only provides indirect 
support for libertarian free will, namely by weakening one of its competitors. 

To close this philosophical intermezzo, let us note that determinism is seen as 
a property of theories. Since it is the job of a deterministic theory to predict the 
outcome of any experiment, whether or not it is performed, this obviates the need for 
assumptions like counterfactuality in the sense that ‘unperformed experiments have 
results’ (which was famously denied by Asher Peres). Such controversial notions of 
counterfactuality have effectively been replaced by the considerably more refined 
modal counterfactuality of Lewis (at least in our slight reformulation thereof). 

6.4 Technical intermezzo: The GHZ-Theorem


The essence of the proof of the Free Will Theorem lies in the argument that per- 
fect correlation together with context-locality implies non-contextuality. Remark- 
ably, context-locality is at the same time a special case of non-contextuality, as the 
following example illustrates. We take $H = \mathbb{C}^2 \otimes \mathbb{C}^2$, equipped with the Bell basis


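$$\upsilon_0 = (|01\rangle - |10\rangle)/\sqrt{2}; \tag{6.69}$$
$$\upsilon_1 = (|01\rangle + |10\rangle)/\sqrt{2}; \tag{6.70}$$
$$\upsilon_2 = (|00\rangle - |11\rangle)/\sqrt{2}; \tag{6.71}$$
$$\upsilon_3 = (|00\rangle + |11\rangle)/\sqrt{2}, \tag{6.72}$$

where we use the physicists’ notation

$$|1\rangle = (1, 0); \tag{6.73}$$
$$|0\rangle = (0, 1); \tag{6.74}$$
$$|ij\rangle = |i\rangle \otimes |j\rangle. \tag{6.75}$$

Of course, $\mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^4$ contains the spin-1 Hilbert space $\mathbb{C}^3$ of the Kochen–Specker
Theorem as the subspace orthogonal to the vector $\upsilon_0$. Thus we identify $\mathbb{C}^3$
with the subspace of $\mathbb{C}^4$ spanned by the basis vectors $\upsilon_1, \upsilon_2, \upsilon_3$. The operators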


$$\tilde{J}_u = \tfrac{1}{2}(\sigma_u \otimes 1_2 + 1_2 \otimes \sigma_u), \tag{6.76}$$

where $u \in \mathbb{R}^3$ is a unit vector as before, and

$$\sigma_u = \sum_{i=1}^{3} \sigma_i u_i \tag{6.77}$$

in terms of the Pauli matrices $\sigma_i$, map $\upsilon_0$ to zero and leave its orthogonal complement
$\mathbb{C}^3$ stable. Elementary group theory or direct calculation then shows that the
operator $J_u$ on $\mathbb{C}^3$ in (6.11) is (unitarily) equivalent to the operator $\tilde{J}_u$ on $\mathbb{C}^3$. Since

$$\tilde{J}_u^2 = \tfrac{1}{2}(\sigma_u \otimes \sigma_u + 1_2 \otimes 1_2), \tag{6.78}$$

the Kochen–Specker argument can be rephrased in terms of the operators $\sigma_u \otimes \sigma_u$.
In particular, for each frame $a = (u_1, u_2, u_3)$, the three operators

$$(\sigma_{u_1} \otimes \sigma_{u_1},\ \sigma_{u_2} \otimes \sigma_{u_2},\ \sigma_{u_3} \otimes \sigma_{u_3}) \tag{6.79}$$

commute, they each square to one, and their joint eigenvalues are one of the triples:

$$(-1, -1, -1),\ (-1, 1, 1),\ (1, -1, 1),\ (1, 1, -1).$$

The eigenvector corresponding to the first one is $\upsilon_0$, and hence the others must lie
in $\mathbb{C}^3$. Hence by Lemma 6.4 any quasi-linear non-contextual hidden variable must
also assign these values, which by Lemma 6.7 is impossible for arbitrary bases.
The key mathematical property of the three operators (6.79) is that they commute,
and together with the unit $1_2 \otimes 1_2$ form a maximal set of commuting self-adjoint
matrices on $\mathbb{C}^4$. But other such sets could have been chosen by Alice (under whose
sole control the situation so far has been assumed to be), such as a triple of the kind

$$(\sigma_u \otimes 1_2,\ 1_2 \otimes \sigma_v,\ \sigma_u \otimes \sigma_v),$$

where $u$ and $v$ are arbitrary unit vectors in $\mathbb{R}^3$. Since the third operator is the product
of the first two, the joint eigenvalues of this triple, and hence also the assignments
by a quasi-linear non-contextual hidden variable, must be one of the four triples

$$(1, 1, 1),\ (-1, 1, -1),\ (1, -1, -1),\ (-1, -1, 1).$$

The non-contextuality assumption would then dictate that the outcome of Alice’s
measurement of $\sigma_u \otimes 1_2$ be independent of her choice of the setting $v$ in a possible
simultaneous measurement of $1_2 \otimes \sigma_v$, and vice versa. Therefore, in a (non-local)
bipartite setting where Alice is only able to measure operators of the type $a \otimes 1_2$,
whilst Bob can measure $1_2 \otimes b$, on the above choice of (commuting) operators,
non-contextuality in the situation where Alice controls everything is mathematically
equivalent to (context) locality in the bipartite Alice & Bob setting.
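
The algebraic claims made about the operators (6.76) - (6.79) are easy to confirm numerically. The following sketch is a sanity check rather than a proof; it assumes the standard Pauli matrices and a randomly generated orthonormal frame, and all variable names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))

def sigma(u):
    """sigma_u = sum_i u_i sigma_i for a real 3-vector u, cf. (6.77)."""
    return sum(ui * s for ui, s in zip(u, PAULI))

# A random orthonormal frame (u_1, u_2, u_3): the columns of an orthogonal matrix.
rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
frame = [q[:, k] for k in range(3)]

ops = [np.kron(sigma(u), sigma(u)) for u in frame]   # the triple (6.79)
commute = all(np.allclose(p @ r, r @ p) for p in ops for r in ops)
square_to_one = all(np.allclose(p @ p, np.eye(4)) for p in ops)
# The product of the three operators is minus the identity, which is why every
# joint eigenvalue triple listed in the text has product -1.
product_is_minus_one = np.allclose(ops[0] @ ops[1] @ ops[2], -np.eye(4))

# Bell vector upsilon_0 = (|01> - |10>)/sqrt(2), with |1> = (1, 0) and |0> = (0, 1).
ket1 = np.array([1, 0], dtype=complex)
ket0 = np.array([0, 1], dtype=complex)
ups0 = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
singlet_eigenvalues = [complex(ups0.conj() @ p @ ups0).real for p in ops]

# Check (6.78) and the claim that the operator (6.76) annihilates upsilon_0.
u = frame[0]
J = 0.5 * (np.kron(sigma(u), I2) + np.kron(I2, sigma(u)))
identity_678 = np.allclose(J @ J, 0.5 * (ops[0] + np.eye(4)))
kills_singlet = np.allclose(J @ ups0, 0)
print(commute, square_to_one, product_is_minus_one, identity_678, kills_singlet)
```

The expectation values in `singlet_eigenvalues` should all equal $-1$, matching the statement that $\upsilon_0$ carries the triple $(-1, -1, -1)$.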

Further constraints then arise if the system is prepared in a correlated state like
$\upsilon_0$, which is an eigenstate of $\sigma_u \otimes \sigma_v$ with eigenvalue $-1$ whenever $u = v$. So in that
case the values of $(\sigma_u \otimes 1_2, 1_2 \otimes \sigma_u)$ can only be $(1, -1)$ or $(-1, 1)$, yielding perfect
anti-correlation. This is not enough, however, to derive a Free Will Theorem; to do
so with the small single-site Hilbert space $\mathbb{C}^2$, one needs a third (non-local) party.

Indeed, the well-known tripartite GHZ-argument may be rephrased as a Free Will 
Theorem, as follows. The underlying Hilbert space is 


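$$H = \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^8, \tag{6.80}$$

and hence as a warm-up we first (re)prove Theorem 6.5 for $n = 8$. Suppose we have
a map $V : H_8(\mathbb{C}) \to \mathbb{R}$ as in Definition 6.1. Write

$$\lambda_a^{(1)} = V(\sigma_a \otimes 1_2 \otimes 1_2), \quad \lambda_b^{(2)} = V(1_2 \otimes \sigma_b \otimes 1_2), \quad \lambda_c^{(3)} = V(1_2 \otimes 1_2 \otimes \sigma_c),$$

where $a, b, c$ can be $1, 2, 3$. From Lemma 6.4 we then have

$$V(\sigma_1 \otimes \sigma_2 \otimes \sigma_2) = \lambda_1^{(1)} \lambda_2^{(2)} \lambda_2^{(3)}; \tag{6.81}$$
$$V(\sigma_2 \otimes \sigma_1 \otimes \sigma_2) = \lambda_2^{(1)} \lambda_1^{(2)} \lambda_2^{(3)}; \tag{6.82}$$
$$V(\sigma_2 \otimes \sigma_2 \otimes \sigma_1) = \lambda_2^{(1)} \lambda_2^{(2)} \lambda_1^{(3)}; \tag{6.83}$$
$$V(\sigma_1 \otimes \sigma_1 \otimes \sigma_1) = \lambda_1^{(1)} \lambda_1^{(2)} \lambda_1^{(3)}. \tag{6.84}$$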
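
Furthermore, the four operators on the left-hand side commute and turn out to satisfy

$$\sigma_1 \otimes \sigma_2 \otimes \sigma_2 \cdot \sigma_2 \otimes \sigma_1 \otimes \sigma_2 \cdot \sigma_2 \otimes \sigma_2 \otimes \sigma_1 = -\sigma_1 \otimes \sigma_1 \otimes \sigma_1, \tag{6.85}$$

so that again by Lemma 6.4,

$$\lambda_1^{(1)} \lambda_2^{(2)} \lambda_2^{(3)} \cdot \lambda_2^{(1)} \lambda_1^{(2)} \lambda_2^{(3)} \cdot \lambda_2^{(1)} \lambda_2^{(2)} \lambda_1^{(3)} = -\lambda_1^{(1)} \lambda_1^{(2)} \lambda_1^{(3)}, \tag{6.86}$$

i.e. $(\lambda_2^{(1)} \lambda_2^{(2)} \lambda_2^{(3)})^2 = -1$. Since $\lambda_a^{(i)} = \pm 1$, this is impossible, so that $V$ cannot exist.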
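
The operator identity (6.85), on which this warm-up rests, can be verified directly on $8 \times 8$ matrices. The sketch below assumes the standard Pauli matrices for $\sigma_1$ and $\sigma_2$; the helper `tensor` is a hypothetical convenience function.

```python
from functools import reduce

import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_1
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_2

def tensor(*factors):
    """Kronecker product of any number of matrices, left to right."""
    return reduce(np.kron, factors)

# The four operators on the left-hand sides of (6.81) - (6.84).
ops = [tensor(s1, s2, s2), tensor(s2, s1, s2), tensor(s2, s2, s1), tensor(s1, s1, s1)]

commute = all(np.allclose(p @ q, q @ p) for p in ops for q in ops)
square_to_one = all(np.allclose(p @ p, np.eye(8)) for p in ops)
identity_685 = np.allclose(ops[0] @ ops[1] @ ops[2], -ops[3])
print(commute, square_to_one, identity_685)
```

The minus sign in (6.85) comes from the middle tensor factor, where $\sigma_2 \sigma_1 \sigma_2 = -\sigma_1$; the other two factors give $+\sigma_1$.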
Now, using the notation in the preceding discussion, consider the unit vector

$$\psi_{GHZ} = (|111\rangle - |000\rangle)/\sqrt{2}, \tag{6.87}$$

which is a joint eigenstate of each of the four operators on the left-hand side of
(6.81) - (6.84), with eigenvalue $+1$ for the first three, and hence eigenvalue $-1$
for the fourth, i.e., $\sigma_1 \otimes \sigma_1 \otimes \sigma_1$. So if setting $A = a$ for Alice (where $a \in \{1, 2\}$)
means that she measures $F = \sigma_a \otimes 1_2 \otimes 1_2$ with outcome $\lambda_a^{(1)} = \pm 1$, and similarly
$B = b$ for Bob and $C = c$ for Cindy mean that they measure $G = 1_2 \otimes \sigma_b \otimes 1_2$ and
$H = 1_2 \otimes 1_2 \otimes \sigma_c$ with outcomes $\lambda_b^{(2)} = \pm 1$ and $\lambda_c^{(3)} = \pm 1$, respectively, then in the
state $\psi_{GHZ}$ each of the settings gives the correlation

$$\text{settings } (a, b, c) = (1, 2, 2), (2, 1, 2), (2, 2, 1) \;\Rightarrow\; \lambda_a^{(1)} \lambda_b^{(2)} \lambda_c^{(3)} = 1; \tag{6.88}$$
$$\text{setting } (a, b, c) = (1, 1, 1) \;\Rightarrow\; \lambda_a^{(1)} \lambda_b^{(2)} \lambda_c^{(3)} = -1. \tag{6.89}$$

Theorem 6.14. The conjunction of the following assumptions is contradictory:

• Determinism: there is a state space $X$ with associated functions

$$A, B, C : X \to \{1, 2\}, \quad F, G, H : X \to \Lambda,$$

which completely describe the experiment, in that $x \in X$ determines both settings
$(a, b, c)$ and outcomes $(\lambda_1, \lambda_2, \lambda_3) \in \Lambda^3$ through $a = A(x)$, $\lambda_1 = F(x)$, etc.

• Nature: the experiment (performed in the state $\psi_{GHZ}$) has possible outcomes in
$\Lambda = \{-1, 1\}$, subject to the correlations (6.88) - (6.89);

• Freedom: there is a further function $Z : X \to X_Z$, in terms of which

$$F = \hat{F}(A, B, C, Z), \quad G = \hat{G}(A, B, C, Z), \quad H = \hat{H}(A, B, C, Z),$$

and $A$, $B$, $C$, $Z$ are independent, i.e. for each $(a, b, c, z)$ there is $x \in X$ such that
$A(x) = a$, $B(x) = b$, $C(x) = c$, $Z(x) = z$.

• Locality: $F = \hat{F}(A, Z)$, $G = \hat{G}(B, Z)$, and $H = \hat{H}(C, Z)$.


Proof. Using notation as in the proof of Theorem 6.13, for fixed $z \in X_Z$ we obtain
$\hat{F}(a, z) = \lambda_a^{(1)}$, etc. Nature then leads to the contradiction derived after (6.86).
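
Since only six values $\pm 1$ are involved for each fixed $z$, the contradiction can also be confirmed by brute force. The sketch below first recomputes the quantum correlations (6.88) - (6.89) for the GHZ state and then searches all $2^6$ local deterministic value tables. It assumes the standard Pauli matrices and the convention $|1\rangle = (1, 0)$, $|0\rangle = (0, 1)$ of (6.73) - (6.74); the function and variable names are hypothetical.

```python
from functools import reduce
from itertools import product

import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_1
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_2
pauli = {1: s1, 2: s2}
ket1 = np.array([1, 0], dtype=complex)
ket0 = np.array([0, 1], dtype=complex)

def tensor(*factors):
    """Kronecker product of any number of vectors or matrices, left to right."""
    return reduce(np.kron, factors)

psi_ghz = (tensor(ket1, ket1, ket1) - tensor(ket0, ket0, ket0)) / np.sqrt(2)

settings = [(1, 2, 2), (2, 1, 2), (2, 2, 1), (1, 1, 1)]
# Expectation of sigma_a x sigma_b x sigma_c in psi_GHZ; since psi_GHZ is an
# eigenvector of each operator, this equals the eigenvalue.
quantum = {s: float((psi_ghz.conj() @ tensor(*(pauli[k] for k in s)) @ psi_ghz).real)
           for s in settings}

def consistent_local_tables():
    """All value tables (F(1), F(2), G(1), G(2), H(1), H(2)) in {-1, 1}^6,
    for one fixed hidden variable z, that reproduce the four correlations."""
    found = []
    for values in product((-1, 1), repeat=6):
        F = {1: values[0], 2: values[1]}
        G = {1: values[2], 2: values[3]}
        H = {1: values[4], 2: values[5]}
        if all(F[a] * G[b] * H[c] == round(quantum[(a, b, c)]) for a, b, c in settings):
            found.append(values)
    return found

tables = consistent_local_tables()
print(quantum, len(tables))
```

The search returns an empty list: multiplying the three constraints of (6.88) forces $F(1) G(1) H(1) = 1$, which contradicts (6.89).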

6.5 Bell’s theorems


Two different results are known as “Bell’s Theorem”: the first, from his paper in
1964, is Theorem 6.15 below, and the second, dating from 1976, is Theorem 6.18.
The first is similar to the Free Will Theorem in both its assumptions and its conclusion,
and to make this similarity more obvious we first state it for $\mathbb{C}^3$ instead of $\mathbb{C}^2$.
The difference lies in the probabilistic flavour of Bell’s Theorem, whose empirical 
input is not given by the only non-probabilistic consequence to be drawn from the 
quantum-mechanical formulae (6.35) - (6.38), viz. the certainty (6.43) of perfect 
correlation on identical settings, but rather by the probabilistic formula (6.40), i.e., 


$$P_{\psi_0}(F_i \neq G_j \mid A_i = u_i, B_j = v_j) = \tfrac{2}{3} \sin^2 \theta_{u_i, v_j} \quad (i, j = 1, 2, 3), \tag{6.90}$$


where $\theta_{u,v}$ is the angle between two unit vectors $u$ and $v$. Furthermore, the state
space $X$ must be upgraded to a probability space $(X, \Sigma, \mu)$, carrying functions $A$
and $B$ (for the settings, which, unlike Bell himself, who treated them as labels,
we include among the random variables), $F$ and $G$ (for the outcomes) and finally $Z$
(for the hidden variable traditionally called $\lambda$) as random variables, i.e., measurable
functions. This also implies that the target spaces $X_A$ to $X_Z$ (which is traditionally
called $\Lambda$) must be equipped with some $\sigma$-algebra of measurable subsets. But this is
not a big deal, since $X_A = X_B$ carries a natural Borel structure and $X_F = X_G$ is finite.
The probability measure $\mu$ is assumed independent of $(A, B, F, G)$, and vice versa.
The measure $\mu$, which gives the “hidden state” of the system that allegedly underlies
its quantum-mechanical description, is chosen in such a way that empirical
probabilities (typically obtained from long runs of repeated measurements) are recovered
as joint conditional probabilities defined by $\mu$ and the random variables,
i.e., assuming the settings $(a, b)$ are possible in that $P(A = a, B = b) > 0$, we put


$$P(F = \lambda, G = \gamma \mid A = a, B = b) = \frac{P(F = \lambda, G = \gamma, A = a, B = b)}{P(A = a, B = b)}, \tag{6.91}$$

where the joint probabilities on the right-hand side are given by

$$P(A = a, B = b) = \mu(A = a, B = b); \tag{6.92}$$
$$P(F = \lambda, G = \gamma, A = a, B = b) = \mu(F = \lambda, G = \gamma, A = a, B = b), \tag{6.93}$$

where $\mu(A = a, B = b)$ is shorthand for $\mu(\{x \in X \mid A(x) = a, B(x) = b\})$, etc. This
implies that $\mu$ depends on (but may not be determined by) the quantum state $\psi_0$.
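
On a finite probability space the definitions (6.91) - (6.93) reduce to sums of weights. The following sketch uses an invented four-point space, with exact rational arithmetic, purely to show how the conditional probability is assembled; the weights and labels are hypothetical and carry no physical meaning.

```python
from fractions import Fraction

# Hypothetical finite probability space: each state x = (a, b, f, g) records the
# settings and outcomes it determines, and mu assigns it a weight.
mu = {
    ("a", "b", 1, 1): Fraction(1, 4),
    ("a", "b", 0, 0): Fraction(1, 4),
    ("a", "b'", 1, 0): Fraction(1, 4),
    ("a'", "b", 0, 1): Fraction(1, 4),
}

def A(x): return x[0]
def B(x): return x[1]
def F(x): return x[2]
def G(x): return x[3]

def measure(event):
    """mu of the set of states satisfying the predicate `event`."""
    return sum(weight for x, weight in mu.items() if event(x))

def conditional(lam, gam, a, b):
    """P(F = lam, G = gam | A = a, B = b) as in (6.91); requires P(A = a, B = b) > 0."""
    denominator = measure(lambda x: A(x) == a and B(x) == b)
    if denominator == 0:
        raise ValueError("settings (a, b) have probability zero")
    numerator = measure(lambda x: F(x) == lam and G(x) == gam and A(x) == a and B(x) == b)
    return numerator / denominator

p_11 = conditional(1, 1, "a", "b")
p_10 = conditional(1, 0, "a", "b")
print(p_11, p_10)
```

The requirement $P(A = a, B = b) > 0$ in the text corresponds to the guard against a zero denominator.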

On this understanding, the assumptions of Determinism and Locality are the
same as for the Free Will Theorem (except that equations like $F(x) = \hat{F}(A(x), Z(x))$
are merely supposed to hold almost everywhere with respect to $\mu$). Freedom is
now taken to mean that $(A, B, Z)$ are probabilistically independent relative to $\mu$. By
definition, this also means that the pairs $(A, B)$, $(A, Z)$, and $(B, Z)$ are independent,
so that for any $\tilde{A} \subset X_A$, $\tilde{B} \subset X_B$, and (measurable) $\tilde{Z} \subset X_Z$, defining

$$P(A \in \tilde{A}, B \in \tilde{B}, Z \in \tilde{Z}) = \mu(\{x \in X \mid A(x) \in \tilde{A}, B(x) \in \tilde{B}, Z(x) \in \tilde{Z}\}), \tag{6.94}$$

and analogous expressions for $P(A \in \tilde{A})$ and $P(A \in \tilde{A}, B \in \tilde{B})$, etc., we have

$$P(A \in \tilde{A}, B \in \tilde{B}) = P(A \in \tilde{A}) P(B \in \tilde{B}); \tag{6.95}$$
$$P(A \in \tilde{A}, Z \in \tilde{Z}) = P(A \in \tilde{A}) P(Z \in \tilde{Z}); \tag{6.96}$$
$$P(B \in \tilde{B}, Z \in \tilde{Z}) = P(B \in \tilde{B}) P(Z \in \tilde{Z}); \tag{6.97}$$
$$P(A \in \tilde{A}, B \in \tilde{B}, Z \in \tilde{Z}) = P(A \in \tilde{A}) P(B \in \tilde{B}) P(Z \in \tilde{Z}). \tag{6.98}$$

If we finally define Nature as the claim that $F$ and $G$ are 2-valued and that

$$P(F_i \neq G_j \mid A_i = u_i, B_j = v_j) = \tfrac{2}{3} \sin^2 \theta_{u_i, v_j} \quad (i, j = 1, 2, 3), \tag{6.99}$$

where the left-hand side is the conditional probability defined by $\mu$ and the random
variables in question (whereas the left-hand side of (6.90) is the empirical probability
for the experiment in question, or, equivalently, the quantum-mechanical prediction
thereof), then we obtain the following spin-1 version of Bell’s first theorem:


Theorem 6.15. Determinism, Freedom, Nature, and Locality are contradictory. 


This formulation is literally the same as Theorem 6.13, but the terms have acquired a 
different technical meaning now, especially Freedom and Nature. Moreover, purists 
would add Probability Theory as an assumption in Bell’s Theorem, as its formalism 
is decidedly non-tautological and its interpretation is far from obvious, even in a 
classical setting. In any case, the proof is practically the same as in the more familiar 
optical version of the EPR-experiment, to which we now turn. 

In the classical (sic) form of the experiment, Alice and Bob perform measurements
on incoming photons by letting them pass through a polaroid glass whose
axis of polarization makes angle $a$ (Alice) or $b$ (Bob) with (say) the horizontal axis
in the plane orthogonal to the direction of propagation of the photons. Considered
in the light of the previous experiment on spin-1 particles, such a choice of settings
may also be seen as a choice of basis for $\mathbb{R}^3$, with the proviso that, assuming (by
convention) the photons move along the y-axis, one basis element $u_2 = (0, 1, 0)$ is
fixed so that the remaining two vectors $(u_1, u_3)$ must lie in the x-z plane (in which,
on a naive picture, the photons may “vibrate”). This constraint gives rise to bases

$$u_1 = (\cos a, 0, \sin a), \quad u_2 = (0, 1, 0), \quad u_3 = (-\sin a, 0, \cos a), \tag{6.100}$$


the first of which (say) gives the actual direction of the axis of polarization. In any 
case, Alice writes down F = 1 if her photon passes her glass at angle a, and F = 0 
if it does not; similarly Bob writes G = 1 (pass) or G = 0 (fail) at setting b. 

In a quantum-mechanical description of the experiment, the Hilbert space of the
photon pair is $\mathbb{C}^2 \otimes \mathbb{C}^2$, and the correlated photon state is taken to be

$$\psi_0 = (e_1 \otimes e_1 + e_2 \otimes e_2)/\sqrt{2}, \tag{6.101}$$

where $e_1 = (1, 0)$ and $e_2 = (0, 1)$ form the standard basis of $\mathbb{C}^2$. The probabilities
(6.35) - (6.38) as predicted by quantum mechanics are now replaced by

$$P_{\psi_0}(F = 1, G = 1 \mid A = a, B = b) = \tfrac{1}{2} \cos^2(a - b); \tag{6.102}$$
$$P_{\psi_0}(F = 0, G = 0 \mid A = a, B = b) = \tfrac{1}{2} \cos^2(a - b); \tag{6.103}$$
$$P_{\psi_0}(F = 1, G = 0 \mid A = a, B = b) = \tfrac{1}{2} \sin^2(a - b); \tag{6.104}$$
$$P_{\psi_0}(F = 0, G = 1 \mid A = a, B = b) = \tfrac{1}{2} \sin^2(a - b), \tag{6.105}$$

which are also the experimentally measured ones. Instead of (6.90) we then obtain

$$P_{\psi_0}(F \neq G \mid A = a, B = b) = \sin^2(a - b); \tag{6.106}$$
$$P_{\psi_0}(F = G \mid A = a, B = b) = \cos^2(a - b). \tag{6.107}$$
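
The formulae (6.102) - (6.107) can be reproduced numerically from the state (6.101). The sketch below assumes that "passing" a polarizer at angle $a$ corresponds to the projection onto the direction $(\cos a, \sin a)$ in $\mathbb{C}^2$, which is a modelling assumption not spelled out in the text; the function names are hypothetical.

```python
import numpy as np

def polarizer(angle):
    """Projection onto the polarization direction (cos angle, sin angle)."""
    d = np.array([np.cos(angle), np.sin(angle)])
    return np.outer(d, d)

def photon_table(a, b):
    """Joint probabilities P(F = f, G = g | A = a, B = b) in the state (6.101).

    Assumption: outcome 1 (pass) is the projection onto the polarizer axis,
    outcome 0 (fail) its complement.
    """
    e = np.eye(2)
    psi0 = (np.kron(e[0], e[0]) + np.kron(e[1], e[1])) / np.sqrt(2)
    alice = {1: polarizer(a), 0: np.eye(2) - polarizer(a)}
    bob = {1: polarizer(b), 0: np.eye(2) - polarizer(b)}
    return {(f, g): float(psi0 @ np.kron(alice[f], bob[g]) @ psi0)
            for f in (0, 1) for g in (0, 1)}

a, b = 0.3, 1.1   # arbitrary test angles in radians
table = photon_table(a, b)
p_differ = table[(1, 0)] + table[(0, 1)]
p_equal = table[(1, 1)] + table[(0, 0)]
print(p_differ, np.sin(a - b) ** 2)
print(p_equal, np.cos(a - b) ** 2)
```

Each printed pair should agree up to rounding error, in line with (6.106) and (6.107).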


In particular, if their settings are the same (i.e., a = b), then Alice and Bob will 
always find the same outcome (perfect correlation), whereas in case they are 
orthogonal (i.e., $a = b \pm \pi/2$), they obtain perfect anti-correlation, in that Alice’s 
photon passes whenever Bob’s is blocked, and vice versa. However, this will not be 
used. Although it should be obvious from the previous case what the assumptions 
in Theorem 6.15 mean for this particular experiment, we make them explicit: 


• Determinism means that there is a probability space $(X,\Sigma,\mu)$ with associated 
(measurable) functions 

$$A: X \to [0,\pi], \quad B: X \to [0,\pi], \quad F: X \to \{0,1\}, \quad G: X \to \{0,1\}, \tag{6.108}$$

which completely describe the experiment in the sense that $x \in X$ determines 
both its settings $a = A(x)$, $b = B(x)$ and its outcomes $\lambda = F(x)$, $\gamma = G(x)$. 

• Freedom stipulates that there is a (measurable) function $Z: X \to X_Z$ such that: 
  - $F = F(A,B,Z)$ and $G = G(A,B,Z)$; 
  - $(A,B,Z)$ are probabilistically independent relative to $\mu$. 

• Locality means that $F(A,B,Z) = F(A,Z)$ and $G(A,B,Z) = G(B,Z)$. 

• Nature states that the empirical as well as theoretical probabilities (6.106) for the 
experiment are reproduced as conditional joint probabilities given by $\mu$ through 

$$P(F \neq G|A=a,B=b) = \sin^2(a-b). \tag{6.109}$$

Theorem 6.15 then holds verbatim for this situation, with the following proof. 


Proof. Determinism and Freedom imply 


$$P(F=\lambda,G=\gamma|A=a,B=b) = P_{ABZ}(F=\lambda,G=\gamma|A=a,B=b), \tag{6.110}$$

where we use the notation (6.50) - (6.51), the function $A: X_A \times X_B \times X_Z \to X_A$ is 
projection on the first coordinate, likewise the function $B: X_A \times X_B \times X_Z \to X_B$ is 
projection on the second, and $P_{ABZ}$ is the joint probability on $X_A \times X_B \times X_Z$ induced 
by the triple $(A,B,Z)$ and the probability measure $\mu$; by independence, $P_{ABZ}$ is a 
product measure on $X_A \times X_B \times X_Z$. According to Locality, $F(a,b,z)$ does not depend 
on b, whilst $G(a,b,z)$ does not depend on a. 
For fixed settings $(a,b)$, we may therefore define the following functions on $X_Z$: 

$$F_a(z) = F(a,z); \tag{6.111}$$
$$G_b(z) = G(b,z). \tag{6.112}$$

A brief computation then yields 

$$P_{ABZ}(F=\lambda,G=\gamma|A=a,B=b) = P_Z(F_a=\lambda,G_b=\gamma), \tag{6.113}$$

where $P_Z$ is the joint probability on $X_Z$ defined by Z and $\mu$. Therefore, from (6.110), 

$$P(F=\lambda,G=\gamma|A=a,B=b) = P_Z(F_a=\lambda,G_b=\gamma). \tag{6.114}$$

Nature then gives the crucial result 

$$P_Z(F_a \neq G_b) = \sin^2(a-b). \tag{6.115}$$

Lemma 6.16. Any four {0,1}-valued random variables $(F_1,F_2,G_1,G_2)$ satisfy 

$$P(F_1 \neq G_1) \leq P(F_1 \neq G_2) + P(F_2 \neq G_1) + P(F_2 \neq G_2). \tag{6.116}$$


This lemma (said to go back to Boole) is very easy to prove directly, but for com- 
pleteness’s sake we mention that it also follows from Proposition 6.17 below. 

Taking $F_1 = F_{a_1}$, $F_2 = F_{a_2}$, $G_1 = G_{b_1}$, $G_2 = G_{b_2}$, and $P = P_Z$, for suitable values of 
$(a_1,a_2,b_1,b_2)$ this inequality is violated by (6.115). Take, for example, $a_2 = b_2 = 3x$, 
$a_1 = 0$, and $b_1 = x$. The inequality (6.116) then assumes the form $f(x) \geq 0$ for 

$$f(x) = \sin^2(3x) + \sin^2(2x) - \sin^2(x).$$

But in fact, $f(x) < 0$ for continuously many values of $x \in [0,2\pi]$, see plot. 
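
Both halves of this argument are easy to check by machine. The sketch below (function names are ours) first verifies the pointwise version of (6.116) for all 16 assignments of values in {0,1}, from which the lemma follows by taking expectations, and then evaluates f at a sample point where it is negative.

```python
import math
from itertools import product

def boole_inequality_holds_pointwise():
    """Check 1[F1!=G1] <= 1[F1!=G2] + 1[F2!=G1] + 1[F2!=G2] for all values in {0,1}."""
    return all(
        (f1 != g1) <= (f1 != g2) + (f2 != g1) + (f2 != g2)
        for f1, f2, g1, g2 in product((0, 1), repeat=4)
    )

def f(x):
    """Right-hand side minus left-hand side of (6.116) for a1 = 0, b1 = x, a2 = b2 = 3x."""
    return math.sin(3 * x) ** 2 + math.sin(2 * x) ** 2 - math.sin(x) ** 2

print(boole_inequality_holds_pointwise())
print(f(1.2))  # negative, so (6.116) fails for the quantum probabilities (6.115)
```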


Graph of $x \mapsto \sin^2(3x) + \sin^2(2x) - \sin^2(x)$, showing (in the region where it is 
negative) that quantum mechanics violates the Bell inequality (6.116). 

Lemma 6.16 is a special case of a more general result. 

Proposition 6.17. Let $F_i: X \to [-1,1]$ and $G_j: X \to [-1,1]$, where $(X,\Sigma,\mu)$ is 
some probability space, be two parametrized random variables, $i,j = 1,2$. Then the 
two-point function $\langle F_iG_j\rangle = \int_X d\mu\, F_iG_j$ satisfies the CHSH-inequality 

$$|\langle F_1G_1\rangle + \langle F_1G_2\rangle + \langle F_2G_1\rangle - \langle F_2G_2\rangle| \leq 2. \tag{6.117}$$

If $F_i$ and $G_j$ just take the values $\pm 1$, then (6.116) is a special case of (6.117). 

Proof. In terms of the function $\Phi = F_1 \cdot (G_1 + G_2) + F_2 \cdot (G_1 - G_2)$, we may write 

$$\langle F_1G_1\rangle + \langle F_1G_2\rangle + \langle F_2G_1\rangle - \langle F_2G_2\rangle = \int_X d\mu\, \Phi. \tag{6.118}$$

Since $|F_i(x)| \leq 1$ and $|G_j(x)| \leq 1$ by assumption, we have $|\Phi(x)| \leq 2$ and hence 

$$\left| \int_X d\mu(x)\, \Phi(x) \right| \leq \int_X d\mu(x)\, |\Phi(x)| \leq 2, \tag{6.119}$$

since $\mu$ is a probability measure. To prove the last claim, we just note that 

$$P(F_i = G_j) - P(F_i \neq G_j) = \langle F_iG_j\rangle;$$
$$P(F_i = G_j) + P(F_i \neq G_j) = 1.$$

Hence $P(F_i \neq G_j) = (1 - \langle F_iG_j\rangle)/2$, and (6.116) follows from (6.117) with the 
indices 1 and 2 interchanged. 


In Bell’s second (1976) theorem on stochastic hidden variables, the assumption 
of Determinism is dropped, and all we have is a theory stating conditional 
probabilities $P(F=\lambda,G=\gamma|A=a,B=b,x)$ for the outcomes of the above bipartite 
experiment given some hidden variable x, as well as the single-wing versions 
$P(F=\lambda|A=a,x)$ and $P(G=\gamma|B=b,x)$. Here F, G, A, B are just notational devices 
to record such outcomes, which are no longer (necessarily) represented as random 
variables. On this new understanding of the notation, the Nature assumption is 
formulated just as before, cf. (6.109). We do assume the existence of a probability 
space $(X,\Sigma,\mu)$ and of conditional probabilities 

$$P(F=\lambda,G=\gamma|A=a,B=b,x), \quad P(F=\lambda|A=a,x), \quad P(G=\gamma|B=b,x),$$

defined $\mu$-a.e. in x, in which the state of the world is specified as being $x \in X$. In 
terms of this space, the Freedom assumption means that 

$$P(F=\lambda,G=\gamma|A=a,B=b) = \int_X d\mu(x)\, P(F=\lambda,G=\gamma|A=a,B=b,x), \tag{6.120}$$

for any settings $(a,b)$, of which $\mu$ is independent (as the notation already indicated). 
The crucial assumption replacing Determinism is Bell locality, which reads 

$$P(F=\lambda,G=\gamma|A=a,B=b,x) = P(F=\lambda|A=a,x) \cdot P(G=\gamma|B=b,x). \tag{6.121}$$
Bell’s second theorem for stochastic hidden variable theories reads as follows. 

Theorem 6.18. Nature, Freedom, and Bell locality are contradictory. 

Proof. The idea of the proof is to introduce an artificial probability space in order 
to recover the framework of Theorem 6.15. To this end, we take 

$$\tilde{X} = [0,1] \times [0,1] \times X; \tag{6.122}$$
$$d\tilde{\mu}(s,t,x) = ds \cdot dt \cdot d\mu(x), \tag{6.123}$$

where we denoted the elements of $\tilde{X}$ by $(s,t,x)$. On $\tilde{X}$, define random variables 

$$\tilde{F}_a(s,t,x) = 1_{[0,P(F=1|A=a,x)]}(s); \tag{6.124}$$
$$\tilde{G}_b(s,t,x) = 1_{[0,P(G=1|B=b,x)]}(t), \tag{6.125}$$

where $1_A$ is the indicator function for $A \subset [0,1]$. Writing, as usual, 

$$\tilde{P}(\tilde{F}_a=\lambda, \tilde{G}_b=\gamma) = \tilde{\mu}(\{(s,t,x) \in \tilde{X} \mid \tilde{F}_a(s,t,x) = \lambda,\ \tilde{G}_b(s,t,x) = \gamma\}),$$

we obtain (first for $\lambda = \gamma = 1$, from which the other cases follow): 

$$\tilde{P}(\tilde{F}_a=\lambda, \tilde{G}_b=\gamma) = \int_X d\mu(x)\, P(F=\lambda|A=a,x) \cdot P(G=\gamma|B=b,x). \tag{6.126}$$

With Freedom and Bell locality, this yields 

$$P(F=\lambda,G=\gamma|A=a,B=b) = \tilde{P}(\tilde{F}_a=\lambda, \tilde{G}_b=\gamma), \tag{6.127}$$

as in (6.114), so that the proof may be completed as for Theorem 6.15. 
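
The construction in this proof turns a stochastic theory into a deterministic one by adding two uniform random numbers s and t. A minimal Monte Carlo sketch follows, assuming for illustration a finite hidden-variable space X with made-up weights and single-wing probabilities; the names `simulate` and `exact` are ours.

```python
import random

def simulate(p_alice, p_bob, weights, n_samples=200_000, seed=0):
    """Estimate the probability that both indicator variables equal 1.

    p_alice[x] stands for P(F=1|A=a,x), p_bob[x] for P(G=1|B=b,x), and
    weights[x] for mu({x}); all three are illustrative inputs.
    """
    rng = random.Random(seed)
    points = list(range(len(weights)))
    hits = 0
    for _ in range(n_samples):
        x = rng.choices(points, weights=weights)[0]  # x distributed according to mu
        s, t = rng.random(), rng.random()            # uniform on [0,1] x [0,1]
        f = 1 if s <= p_alice[x] else 0              # indicator of [0, P(F=1|A=a,x)] at s
        g = 1 if t <= p_bob[x] else 0                # indicator of [0, P(G=1|B=b,x)] at t
        hits += f * g
    return hits / n_samples

def exact(p_alice, p_bob, weights):
    """Right-hand side of (6.126) for lambda = gamma = 1 on a finite space X."""
    total = sum(weights)
    return sum(w / total * pa * pb for w, pa, pb in zip(weights, p_alice, p_bob))

print(simulate([0.2, 0.9], [0.5, 0.3], [0.4, 0.6]))
print(exact([0.2, 0.9], [0.5, 0.3], [0.4, 0.6]))
```

The two numbers agree up to sampling error, which illustrates (6.126) in this toy setting.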


Let us note that since in Bell’s second theorem the settings $(a,b)$ are treated as free 
parameters to begin with, the difference between X and Z evaporates, so that in the 
above formulae one might as well have replaced $(X,\mu)$ by the space $(X_Z,\mu_Z)$ that 
describes all relevant degrees of freedom except the settings (i.e., the experimentalist, 
in either human or machine form). Either way, Bell’s locality condition may be 
disentangled into the following conditions (introduced by Jarrett and Shimony): 

1. Parameter Independence (PI): 

$$P(\lambda|a,b,x) = P(\lambda|a,x); \tag{6.128}$$
$$P(\gamma|a,b,x) = P(\gamma|b,x); \tag{6.129}$$

2. Outcome Independence (OI): 

$$P(\lambda|a,b,\gamma,x) = P(\lambda|a,b,x); \tag{6.130}$$
$$P(\gamma|a,b,\lambda,x) = P(\gamma|a,b,x), \tag{6.131}$$

where we have abbreviated $P(F=\lambda|A=a,B=b,x)$ by $P(\lambda|a,b,x)$, etc., and have 
used the following notation (which states identities in case one has (6.91) - (6.93)): 

$$P(\lambda|a,b,x) = \sum_{\gamma} P(\lambda,\gamma|a,b,x); \tag{6.132}$$
$$P(\gamma|a,b,x) = \sum_{\lambda} P(\lambda,\gamma|a,b,x); \tag{6.133}$$
$$P(\lambda|a,b,\gamma,x) = \frac{P(\lambda,\gamma|a,b,x)}{P(\gamma|a,b,x)}; \tag{6.134}$$
$$P(\gamma|a,b,\lambda,x) = \frac{P(\lambda,\gamma|a,b,x)}{P(\lambda|a,b,x)}. \tag{6.135}$$


It is easy to see that Bell locality is equivalent to the conjunction of PI and OI. 
Note that the former (PI), akin to Locality, is a hidden or ‘subsurface’ version of 
the no signaling property of the ‘surface’ probabilities, which states that 

$$P(\lambda|a,b) = \sum_{\gamma} P(\lambda,\gamma|a,b)$$

is independent of b (and vice versa). But a violation of PI only leads to signaling if x 
can be operationally controlled, similar to the way in which experimental physicists 
prepare quantum states $\psi$. Hence it is reassuring that quantum mechanics satisfies 
PI if we see the quantum state $\psi$ as a hidden variable: assuming 

$$P(\lambda,\gamma|a,b,x) = P_\psi(F=\lambda,G=\gamma|A=a,B=b), \tag{6.136}$$

as computed in (6.102) - (6.105), PI is valid but OI is not. First, for $\lambda = 0$ or $\lambda = 1$, 

$$P(\lambda|a,b,x) = \sum_{\gamma=0,1} P_\psi(F=\lambda,G=\gamma|a,b) = \tfrac{1}{2}\cos^2(a-b) + \tfrac{1}{2}\sin^2(a-b) = \tfrac{1}{2}, \tag{6.137}$$

which is independent of b, and likewise $P(\gamma|a,b,x) = \tfrac{1}{2}$, independently of a. This 
yields PI, which a similar computation shows to be true for any quantum state. On 
the other hand, given this result, OI would require 

$$P_\psi(F=\lambda,G=\gamma|A=a,B=b) = P_\psi(F=\lambda|A=a) \cdot P_\psi(G=\gamma|B=b),$$

which is false, since by (6.102) - (6.105), Alice’s and Bob’s outcomes are correlated. 
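
As a small numerical illustration of this computation (a sketch with hypothetical function names, taking the probabilities (6.102) - (6.105) as given), one may check that the marginals equal 1/2 for every setting, while the joint probabilities do not factorize:

```python
import math

def joint(lam, gam, a, b):
    """Joint probabilities (6.102)-(6.105), with the quantum state as hidden variable."""
    d = a - b
    return 0.5 * math.cos(d) ** 2 if lam == gam else 0.5 * math.sin(d) ** 2

def marginal_alice(lam, a, b):
    """Sum over Bob's outcomes, as in (6.132)."""
    return sum(joint(lam, gam, a, b) for gam in (0, 1))

def marginal_bob(gam, a, b):
    """Sum over Alice's outcomes, as in (6.133)."""
    return sum(joint(lam, gam, a, b) for lam in (0, 1))

def factorization_gap(a, b):
    """Largest deviation of the joint probability from the product of its marginals."""
    return max(
        abs(joint(lam, gam, a, b) - marginal_alice(lam, a, b) * marginal_bob(gam, a, b))
        for lam in (0, 1) for gam in (0, 1)
    )

# PI: Alice's marginal does not depend on Bob's setting b.
print([marginal_alice(1, 0.3, b) for b in (0.0, 0.5, 1.0)])
# OI fails: for equal settings the gap is |1/2 - 1/4| = 1/4.
print(factorization_gap(0.3, 0.3))
```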

Hence Bell locality is violated by quantum mechanics, but this does not imply 
that “quantum mechanics is nonlocal” (as some say). Bell’s is a very specific locality 
condition invented as a constraint on hidden variable theories. In another important 
sense, viz. Einstein locality, quantum mechanics is local, in that observables with 
spacelike separated localization regions commute (this is the case in quantum field 
theory, but also in any bipartite experiment of the type considered here, where Al- 
ice’s operators commute with Bob’s just by definition of the tensor product). 

On the other hand, deterministic theories, which in the present context are defined 
as those for which all conditional probabilities like $P(\lambda,\gamma|a,b,x)$ are either zero or 
one (in which case one may introduce random variables reproducing these probabil- 
ities), violate PI but satisfy OI, at least if they reproduce the Born probabilities (such 
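as Bohmian mechanics). Hence such theories violate Bell locality. 

Finally, Bell-type inequalities like (6.117) also give information about quantum 
mechanics itself, particularly about the degree of entanglement of states. Let $H_1$ and 
$H_2$ be Hilbert spaces, with tensor product $H_1 \otimes H_2$. A unit vector $\psi \in H_1 \otimes H_2$ is 
called uncorrelated if it is of the form $\psi = \varphi_1 \otimes \varphi_2$, where $\varphi_k \in H_k$ are unit vectors, 
$k = 1,2$, and correlated otherwise. Clearly, the vectors (6.34) and (6.101) used in 
the experiments so far are correlated. The simplest result is then as follows. 

Theorem 6.19. Let $a_1$ and $a_2$ be self-adjoint operators on $H_1$, and let $b_1$ and $b_2$ be 
self-adjoint operators on $H_2$, each with spectrum contained in $[-1,1]$ (equivalently 
$\|a_i\| \leq 1$, etc.). Let $\psi$ be a unit vector in $H_1 \otimes H_2$, and define two-point functions 

$$\langle F_iG_j\rangle = \langle \psi, a_i \otimes b_j \psi\rangle. \tag{6.138}$$

If $\psi$ is uncorrelated, then the Bell inequality (6.117) holds. 

Proof. This follows from the factorization property 

$$\langle F_iG_j\rangle = \langle \varphi_1 \otimes \varphi_2, a_i \otimes b_j\, \varphi_1 \otimes \varphi_2\rangle = \langle \varphi_1, a_i\varphi_1\rangle \cdot \langle \varphi_2, b_j\varphi_2\rangle = \langle F_i\rangle \cdot \langle G_j\rangle, \tag{6.139}$$

where $\langle F_i\rangle = \langle \varphi_1, a_i\varphi_1\rangle$ and $\langle G_j\rangle = \langle \varphi_2, b_j\varphi_2\rangle$. For either sign, this property yields 

$$\langle F_2(G_1 - G_2)\rangle = \langle F_2\rangle\langle G_1\rangle(1 \pm \langle F_1\rangle\langle G_2\rangle) - \langle F_2\rangle\langle G_2\rangle(1 \pm \langle F_1\rangle\langle G_1\rangle). \tag{6.140}$$

The spectral assumption implies that $|\langle F_i\rangle| \leq 1$ and $|\langle G_j\rangle| \leq 1$, which will be used 
directly below, as well as its consequence $|1 \pm \langle F_i\rangle\langle G_j\rangle| = 1 \pm \langle F_i\rangle\langle G_j\rangle$. Hence 

$$|\langle F_2(G_1 - G_2)\rangle| \leq |1 \pm \langle F_1\rangle\langle G_2\rangle| + |1 \pm \langle F_1\rangle\langle G_1\rangle|$$
$$= 1 \pm \langle F_1\rangle\langle G_2\rangle + 1 \pm \langle F_1\rangle\langle G_1\rangle$$
$$= 2 \pm \langle F_1(G_1 + G_2)\rangle. \tag{6.141}$$

Similarly, 

$$|\langle F_1(G_1 + G_2)\rangle| \leq 2 \pm \langle F_2(G_1 - G_2)\rangle, \tag{6.142}$$

so that, writing $\Phi = \langle F_1G_1\rangle + \langle F_1G_2\rangle + \langle F_2G_1\rangle - \langle F_2G_2\rangle$, for either sign $\pm$ we have 

$$|\Phi| \leq |\langle F_1(G_1 + G_2)\rangle| + |\langle F_2(G_1 - G_2)\rangle| \leq 4 \pm \Phi. \tag{6.143}$$

If $\Phi > 0$ we choose the minus sign, whereas for $\Phi < 0$ we take the plus sign. Either 
way, we obtain $|\Phi| \leq 2$, which is the inequality (6.117). 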
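
Theorem 6.19 can be illustrated numerically on $\mathbb{C}^2 \otimes \mathbb{C}^2$. The sketch below makes several choices of our own that do not come from the text: observables of the form cos(t) σ_z + sin(t) σ_x (self-adjoint with spectrum {-1, 1}), real product states, and one standard set of angles for the Bell state. All function names are hypothetical.

```python
import math
from itertools import product

def observable(theta):
    """cos(theta)*sigma_z + sin(theta)*sigma_x as a real symmetric 2x2 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def two_point(psi, a, b):
    """<psi, (a (x) b) psi> for a real vector psi, with component index 2*i + k."""
    return sum(
        psi[2 * i + k] * a[i][j] * b[k][l] * psi[2 * j + l]
        for i, k, j, l in product(range(2), repeat=4)
    )

def chsh(psi, thetas, phis):
    """The combination appearing on the left-hand side of (6.117)."""
    a1, a2 = (observable(t) for t in thetas)
    b1, b2 = (observable(p) for p in phis)
    return (two_point(psi, a1, b1) + two_point(psi, a1, b2)
            + two_point(psi, a2, b1) - two_point(psi, a2, b2))

def product_state(alpha, beta):
    """Uncorrelated unit vector (cos alpha, sin alpha) (x) (cos beta, sin beta)."""
    u = (math.cos(alpha), math.sin(alpha))
    w = (math.cos(beta), math.sin(beta))
    return [u[i] * w[k] for i in range(2) for k in range(2)]

settings = ((0.0, math.pi / 2), (math.pi / 4, -math.pi / 4))
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
print(chsh(bell, *settings))                     # 2*sqrt(2): (6.117) is violated
print(chsh(product_state(0.3, 1.1), *settings))  # stays within [-2, 2]
```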


This result is actually much more general (as hinted at by the way that the proof 
only uses the uncorrelated vector state $\psi = \varphi_1 \otimes \varphi_2$). The simplest generalization 
is to replace pure states by mixed states, where we say that a density operator $\rho$ 
on $H_1 \otimes H_2$ is uncorrelated if it is of the form $\rho = \sum_i p_i\, \rho_1^{(i)} \otimes \rho_2^{(i)}$, where the $p_i$ are 
probabilities and $\rho_k^{(i)}$ is a density matrix on $H_k$, $k = 1,2$. Then all uncorrelated density 
matrices satisfy the inequality (6.117). Even more generally, uncorrelated states on 
C*-algebras or von Neumann algebras $A \otimes B$ satisfy (6.117), see Notes. 

6.6 The Colbeck–Renner Theorem 


One may try to strengthen Bell’s second theorem by weakening its assumptions. A 
remarkable result in this direction states that, roughly speaking, any probabilistic 
hidden variable theory that satisfies Freedom and Parameter Independence and is 
compatible with quantum mechanics adds nothing to quantum mechanics. In other 
words, it appears that quantum mechanics “cannot be extended”, or “is complete”. 
In fact, the result turns out to be more modest than this summary suggests, since 
the reasoning required to prove the claim hinges on certain assumptions which are 
satisfied by quantum mechanics itself, but might seem unnatural for a hidden vari- 
able theory. In any case, we have to state our notation and assumptions very clearly. 


Definition 6.20. A hidden variable theory underlying quantum mechanics consists 
of a measurable space $(X,\Sigma)$ whose points x label conditional probabilities 

$$P(a_1 = \lambda_1, \ldots, a_n = \lambda_n|x) \equiv P(a = \lambda|x)$$

for the possible outcomes $\lambda = (\lambda_1,\ldots,\lambda_n)$ of a measurement of any family $a = (a_1,\ldots,a_n)$ 
of n commuting self-adjoint operators on any Hilbert space H. 
These formal conditional probabilities are a priori only supposed to satisfy 

$$0 \leq P(a = \lambda|x) \leq 1; \tag{6.144}$$
$$\sum_{\lambda} P(a = \lambda|x) = 1. \tag{6.145}$$

Apart from these probabilities, for each Hilbert space H and any pure state $e \in \mathcal{P}_1(H)$, 
the theory $\mathcal{T}$ yields a classical state $\mu_e$, i.e., a probability measure on X. 

As the notation indicates, $\mu_e$ depends on e only and hence is independent of a and 
$\lambda$. From the point of view of $\mathcal{T}$, a quantum state is a probability measure on X! In 
what follows we assume for simplicity that H is finite-dimensional, so that $e = e_\psi$ 
for some unit vector $\psi \in H$. With slight abuse of notation we then write $\mu_\psi$ for $\mu_{e_\psi}$. 

An important special case will be the bipartite setting $H = H_1 \otimes H_2$, where Alice 
and Bob measure self-adjoint operators X and Y on $H_1$ and $H_2$, respectively, so that 

$$n = 2, \quad a_1 = X \otimes 1_{H_2}, \quad a_2 = 1_{H_1} \otimes Y.$$

We then introduce settings $c = (a,b)$, as in the previous sections, so that we typically 
look at expressions like $P(X_a = \lambda_1, Y_b = \lambda_2|x)$. The other case of interest will simply 
be $n = 1$ with $a_1 = a$, $\lambda_1 = \lambda$; indeed, this will be the case in the statement of the 
theorem (the bipartite case playing a role only in the proof, though a crucial one!). 
The following notation will be quite important to the argument. An equality 

$$P_\psi(a = \lambda|x) = \alpha(x), \tag{6.146}$$

where $\alpha: X \to [0,1]$ is measurable (often even constant), abbreviates: 

$P(a = \lambda|x) = \alpha(x)$ for almost every x with respect to the measure $\mu_\psi$. 

That is, there is a subset $X' \subset X$ such that $\mu_\psi(X') = 0$ and $P(a = \lambda|x) = \alpha(x)$ holds 
for any $x \in X \setminus X'$. If X is finite, this simply means that the equality holds for any x 
for which $\mu_\psi(\{x\}) > 0$. Since this notation may render equalities like 

$$P_\psi(a = \lambda|x) = P_\varphi(a' = \lambda'|x), \tag{6.147}$$

ambiguous, we explicitly define (6.147) as the double implication 

$$P_\psi(a = \lambda|x) = \alpha(x) \iff P_\varphi(a' = \lambda'|x) = \alpha(x).$$

Furthermore, for $\varepsilon \to 0$ we write 

$$P_\psi(a = \lambda|x) \overset{\varepsilon}{\approx} P_\varphi(a' = \lambda'|x) \iff P_\psi(a = \lambda|x) = P_\varphi(a' = \lambda'|x) + O(\sqrt{\varepsilon}), \tag{6.148}$$

as well as 

$$\psi \overset{\varepsilon}{\approx} \varphi \iff (1 - \varepsilon) \leq |\langle \psi, \varphi\rangle| \leq 1. \tag{6.149}$$


We are now ready to state our assumptions for the Colbeck–Renner Theorem: 

• Compatibility with Quantum Mechanics (CQ): for any unit vector $\psi \in H$, 

$$\int_X d\mu_\psi(x)\, P(a = \lambda|x) = p_\psi(a = \lambda), \tag{6.150}$$

where the quantum-mechanical prediction $p_\psi(a = \lambda)$ is given by the Born rule 

$$p_\psi(a = \lambda) = \langle \psi, e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n} \psi\rangle, \tag{6.151}$$

cf. (2.21), where $e^{(i)}_{\lambda_i}$ is the spectral projection on the eigenspace $H_{\lambda_i} \subset H$ of $a_i$. 

• Unitary Invariance (UI): for any unit vector $\psi \in H$ and unitary u on H, 

$$P_{u\psi}(a = \lambda|x) = P_\psi(u^{-1}au = \lambda|x). \tag{6.152}$$

• Continuity of Probabilities (CP): If $\psi \overset{\varepsilon}{\approx} \varphi$, then $P_\psi(a = \lambda|x) \overset{\varepsilon}{\approx} P_\varphi(a = \lambda|x)$. 

In the remaining axioms, $H = H_1 \otimes H_2$, and a and b are self-adjoint operators on $H_1$ 
and $H_2$, respectively (duly identified with operators $a \otimes 1_{H_2}$ and $1_{H_1} \otimes b$ on H). 

• Parameter Independence (PI): 

$$\sum_{\gamma \in \sigma(b)} P(a = \lambda, b = \gamma|x) = P(a = \lambda|x); \tag{6.153}$$
$$\sum_{\lambda \in \sigma(a)} P(a = \lambda, b = \gamma|x) = P(b = \gamma|x). \tag{6.154}$$

• Product Extension (PE): for any pair of states $\psi_1 \in H_1$, $\psi_2 \in H_2$, 

$$P_{\psi_1}(a = \lambda|x) = P_{\psi_1 \otimes \psi_2}(a = \lambda|x). \tag{6.155}$$

• Schmidt Extension (SE): if $v_i \in H_1$ ($i = 1,\ldots,\dim(H_1)$) are eigenstates of a, then 
for arbitrary orthogonal states $u_i \in H_2$ and coefficients $c_i > 0$ with $\sum_i c_i^2 = 1$, 

$$P_{\sum_i c_i v_i}(a = \lambda|x) = P_{\sum_i c_i \cdot v_i \otimes u_i}(a = \lambda|x). \tag{6.156}$$
Note that PI makes sense, because (6.151) and (6.150) imply that for $p_\psi(a = \lambda)$ 
to be nonzero we must have $\lambda_i \in \sigma(a_i)$ for each i. All assumptions are satisfied by 
quantum mechanics itself (seen as a hidden variable theory with $\psi$ as the “hidden” 
variable x). In the context of hidden variable theories, though, one might doubt the 
plausibility of UI, CP, and SE. But we need all these assumptions to prove: 

Theorem 6.21. If $\mathcal{T}$ satisfies CQ, UI, CP, PI, PE, and SE, then for any (finite-dimensional) 
Hilbert space H, unit vector $\psi \in H$, and operator $a \in B(H)_{\mathrm{sa}}$, 

$$P_\psi(a = \lambda|x) = p_\psi(a = \lambda). \tag{6.157}$$

In other words, the hidden variable x is even more hidden than expected, since knowing 
its value has no effect on the probabilities for the outcomes of experiments. 

Proof. We first assume (without loss of generality) that a is nondegenerate as a self-adjoint 
matrix, in that it has distinct eigenvalues $(\lambda_1,\ldots,\lambda_{\dim(H)})$; this assumption 
will be removed at the end of the proof. The proof consists of three steps. 

1. The theorem holds for $H = \mathbb{C}^2$ and any pair $(a,\psi)$ for which 

$$p_\psi(a = \lambda_1) = p_\psi(a = \lambda_2) = 1/2. \tag{6.158}$$

This only requires assumptions CQ, PI, and SE. 
2. The theorem holds for $H = \mathbb{C}^l$, $l < \infty$ arbitrary, and any pair $(a,\psi)$ for which 

$$p_\psi(a = \lambda_1) = \cdots = p_\psi(a = \lambda_l) = 1/l. \tag{6.159}$$

This is just a slight extension of step 1 and uses the same three assumptions. 
3. The theorem holds in general. This requires all assumptions (as well as step 2). 


Proof of step 1. Let $H = \mathbb{C}^2$, with basis $(v_1,v_2)$ of eigenvectors of a, so that $\psi \in \mathbb{C}^2$ 
may be written as 

$$\psi = (v_1 + v_2)/\sqrt{2}. \tag{6.160}$$

Without loss of generality, we may assume that $\lambda_1 = 1$ and $\lambda_2 = -1$. We now relabel 


N > 1, putting 0, = k/2N, and defining 
Ck= COs a = COs (6.161) 


where, for any angle $\theta \in [0,2\pi]$, the operator $e_\theta = |\theta\rangle\langle\theta|$ is the orthogonal projection 
onto the one-dimensional subspace spanned by the unit vector 

$$|\theta\rangle = \sin(\theta/2) \cdot v_1 + \cos(\theta/2) \cdot v_2. \tag{6.162}$$

In the bipartite setting, we have operators $a_k = c_k \otimes 1_2$ and $b_k = 1_2 \otimes c_k$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$, 
as well as a maximally correlated (Bell) state $\psi_{AB} \in \mathbb{C}^2 \otimes \mathbb{C}^2$, given by 

$$\psi_{AB} = \frac{1}{\sqrt{2}}(v_1 \otimes v_1 + v_2 \otimes v_2). \tag{6.163}$$

Using assumptions PI and SE, we then have, for $i = 1,2$, $\lambda_1 = 1$, and $\lambda_2 = -1$, 

$$P_\psi(a = \lambda_i|x) = P_{\psi_{AB}}(a_0 = \lambda_i|x). \tag{6.164}$$

The quantum-mechanical prediction is 

$$p_{\psi_{AB}}(a_0 = 1) = p_{\psi_{AB}}(a_0 = -1) = \tfrac{1}{2}. \tag{6.165}$$

Our goal is to show that also for each $x \in X$, knowing x is irrelevant in that 

$$P_{\psi_{AB}}(a_0 = 1|x) = P_{\psi_{AB}}(a_0 = -1|x) = \tfrac{1}{2}. \tag{6.166}$$

To this effect we introduce the combination of probabilities 

$$I^{(N)}(x) = P(a_0 = b_{2N-1}|x) + \sum_{k \in K_N,\, l \in L_N,\, |k-l|=1} P(a_k \neq b_l|x), \tag{6.167}$$

where $K_N = \{0,2,\ldots,2N-2\}$ and $L_N = \{1,3,\ldots,2N-1\}$. Our first inequality is 

$$|P(a_k = \lambda_i|x) - P(b_l = \lambda_i|x)| = |P(a_k = \lambda_i, b_l = \lambda_i|x) + P(a_k = \lambda_i, b_l \neq \lambda_i|x)$$
$$\qquad - P(a_k = \lambda_i, b_l = \lambda_i|x) - P(a_k \neq \lambda_i, b_l = \lambda_i|x)|$$
$$= |P(a_k = \lambda_i, b_l \neq \lambda_i|x) - P(a_k \neq \lambda_i, b_l = \lambda_i|x)|$$
$$\leq P(a_k = \lambda_i, b_l \neq \lambda_i|x) + P(a_k \neq \lambda_i, b_l = \lambda_i|x)$$
$$= P(a_k \neq b_l|x), \tag{6.168}$$

where $i = 1,2$, and we used PI. This implies a second inequality: since $a_{2N} = -a_0$, 

$$|P(a_0 = 1|x) - P(a_0 = -1|x)| = |P(a_0 = 1|x) - P(a_{2N} = 1|x)|$$
$$\leq \sum_{k,l,\, |k-l|=1} |P(a_k = 1|x) - P(b_l = 1|x)|$$
$$\leq \sum_{k,l,\, |k-l|=1} P(a_k \neq b_l|x) \leq I^{(N)}(x).$$

Integrating this with respect to the measure $\mu_{\psi_{AB}}$ and using CQ gives 

$$\int_X d\mu_{\psi_{AB}}(x)\, |P(a_0 = 1|x) - P(a_0 = -1|x)| \leq \int_X d\mu_{\psi_{AB}}(x)\, I^{(N)}(x) = I^{(N)}_{\psi_{AB}}. \tag{6.169}$$

We wish to invoke the corresponding quantum-mechanical expression, defined by 

$$I^{(N)}_{\psi_{AB}} = p_{\psi_{AB}}(a_0 = b_{2N-1}) + \sum_{k \in K_N,\, l \in L_N,\, |k-l|=1} p_{\psi_{AB}}(a_k \neq b_l). \tag{6.170}$$

A straightforward calculation shows that this expression is equal to 

$$I^{(N)}_{\psi_{AB}} = 2N\sin^2(\pi/4N). \tag{6.171}$$

Since $\lim_{N\to\infty} I^{(N)}_{\psi_{AB}} = 0$, letting $N \to \infty$ in (6.169) therefore yields (6.166). From 
(6.164) we then obtain (6.158). 
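
The “straightforward calculation” behind (6.171) can be reproduced numerically. The sketch below rests on two assumptions of ours, since the definition (6.161) is not legible in this copy: the angles are taken to be θ_k = kπ/2N, and the outcomes +1 and -1 of the k-th observable are taken to correspond to the vectors |θ_k⟩ and |θ_k + π⟩ of (6.162). Function names are hypothetical.

```python
import math

def ket(theta):
    """|theta> = sin(theta/2) v1 + cos(theta/2) v2, as in (6.162)."""
    return (math.sin(theta / 2), math.cos(theta / 2))

def prob_equal(theta_a, theta_b):
    """Born probability of equal outcomes in the Bell state (6.163).

    Assumption: outcome +1 corresponds to |theta>, outcome -1 to |theta + pi>.
    """
    total = 0.0
    for shift in (0.0, math.pi):
        u, w = ket(theta_a + shift), ket(theta_b + shift)
        amplitude = (u[0] * w[0] + u[1] * w[1]) / math.sqrt(2)
        total += amplitude ** 2
    return total

def chained_sum(n):
    """Quantum-mechanical value of the combination (6.170) for N = n."""
    def theta(k):
        return k * math.pi / (2 * n)  # assumed angles theta_k

    total = prob_equal(theta(0), theta(2 * n - 1))
    for k in range(0, 2 * n - 1, 2):        # k in K_N = {0, 2, ..., 2N-2}
        for l in range(1, 2 * n, 2):        # l in L_N = {1, 3, ..., 2N-1}
            if abs(k - l) == 1:
                total += 1.0 - prob_equal(theta(k), theta(l))
    return total

for n in (1, 2, 10, 1000):
    print(n, chained_sum(n), 2 * n * math.sin(math.pi / (4 * n)) ** 2)
```

Under these assumptions the sum has 2N terms, each equal to sin²(π/4N), and decays like π²/8N.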


Proof of step 2. Let $H = \mathbb{C}^l$ and let $(v_i)_{i=1}^l$ be an orthonormal basis of eigenvectors 
of a, with corresponding eigenvalues $\lambda_i$, and phase factors for the eigenvectors $v_i$ 
such that $c_i > 0$ (and of course, $\sum_i c_i^2 = 1$) in the expansion 

$$\psi = \sum_i c_i v_i. \tag{6.172}$$

The case of interest will be $c_1^2 = \cdots = c_l^2 = 1/l$, but first we merely assume that 
$c_1 = c_2$ (the same reasoning applies to any other pair), with $\lambda_1 = 1$ and $\lambda_2 = -1$ 
(which involves no loss of generality either and just simplifies the notation). The 
other positive coefficients $c_i$ are arbitrary. Generalizing (6.166), we will show that 

$$P_\psi(a = 1|x) = P_\psi(a = -1|x). \tag{6.173}$$

This shows that if two Born probabilities defined by some quantum state $e_\psi$ are 
equal, then the underlying hidden variable probabilities must be equal $\mu_\psi$-a.e., too. 
Eq. (6.159) immediately follows from this result by taking all $c_i$ to be equal. 

As in step 1, we pass to the bipartite setting, introducing two copies of $H = \mathbb{C}^l$ 
denoted by $H_A = H_B = \mathbb{C}^l$, and define the correlated state 

$$\psi_{AB} = \sum_i c_i\, v_i \otimes v_i \tag{6.174}$$

in $H_A \otimes H_B$. Eq. (6.164) again follows from assumptions PI and SE. Throughout 
the argument of step 1, we now replace each probability P(a, = A;,b; = y;|x) by an 
adapted probability P\) (a, = A;,b) = y;|x), defined as the conditional probability 


P') (ay = Ai, br = |x) = Plax = 4i,b1 = YI |Ai| = || = 1,2) 
Play = 44,61 = %, |Ai| = || = 11x) 


= , (6.175) 
P(lAi| = |] = 1x) 
for all x for which P(|A;| = |%| = 1|x) > 0, whereas 
P') (ay = 1i,b1 = Bx) =0 (6.176) 


whenever P(|A;| = |¥2| = 1|x) = 0. The same argument then yields (6.169), with P 
replaced by P() but with the same right-hand side. As in step 1, 




PY, (ao = 1x) = PY) (ao = —1[x), (6.177) 
which implies that 
Pyyg (do = 1|x) = Pyyg(a0 = —1 |x), (6.178) 


either because both sides vanish (if P(|A;| = |Y2| = 1|x) = 0), or because (in the op- 
posite case) the denominator P(|A;| = |%2| = 1|x) cancels from both sides of (6.177). 
Combined with (6.164), eq. (6.178) proves (6.173) and hence establishes step 2. 


Proof of step 3. This is the most difficult step in the proof, relying on a technique 
wittily called embezzlement (which we only need for maximally entangled states). 
We will deal with three Hilbert spaces, namely $H = \mathbb{C}^l$, $H' = \mathbb{C}^m$, and $H'' = \mathbb{C}^n$ 
(where $n = m^N$ for some large N, see below), each with some fixed orthonormal 
basis $(v_i)_{i=1}^l$, $(v'_j)_{j=1}^m$, and $(v''_k)_{k=1}^n$, respectively. Given a further number $m_i \leq m$, 
we now list the nm basis vectors $v''_k \otimes v'_j$ of $H'' \otimes H'$ in two different orders: 


" / MN / " / " / " / N" fee 
1. VD] @vVj,...,V), @ Vj, Vy @ Vy,..., Vj), @ Vy,..., V] @Vi,,...,V, @ V3 
" / / nN / N / " / N / 
2. Vj @Vjz,..+, Vy @ Vp, Vz @ Vy, 06+, Vy @ Vigigs sey Vy @ Vz y0445Vpq @ Vig. seees 
where the remaining vectors (i.e., those of the form v;' ® vi for 1<k<nand j >m)) 
are listed in some arbitrary order. 


Define 
u") -H" @H' +H" @H' (6.179) 


as the unitary operator that maps the first list on the second. We will need the explicit 
expression 


ul) (vf @ vi) = v4 Qi, (6.180) 
“k k 
where for given k = 1,...,n the numbers si = 1,...,n; (where n; is the smallest 
integer such that njm; > n) and i =1,...,n; are uniquely determined by 
k= (si, —1)m;+ ji. (6.181) 


We will actually work with two copies of H" ® H’, called HY ® Hj, and Hz ® Hp, 

with ensuing copies of ul) and ul ) of ul), and hence, leaving the isomorphism 
Hi ® Hi, @ Hy ® Hy = Hi @H4 @H,@Hp (6.182) 

implicit, we obtain a unitary operator 

mj) 


uO”) @ ul” : H @ HE @ Hi, @ Hh > HY @ Hi @Hi, @ Hp. (6.183) 


The point of all this is that the unit vector 

$$\kappa_n \in H''_A \otimes H''_B; \tag{6.184}$$
$$\kappa_n = \frac{1}{\sqrt{C(n)}} \sum_{k=1}^n \frac{1}{\sqrt{k}}\, v''_k \otimes v''_k, \tag{6.185}$$

where $C(n) = \sum_{k=1}^n 1/k$, acts as a “catalyst” in producing the maximally entangled 
state 


0 € Hi) OH: (6.186) 
_y 
Q= Vj, @ Vj, (6.187) 
vmi : : 
from the uncorrelated state vj ® vj € H/, ® Hp, in that for any mj; < m, 
is e/2 
A") Ul") (Ky @ vi @V1) & ky @@. (6.188) 


Here € = 1/N ifn = m2. This follows straightforwardly from (6.183) - (6.187). 


After this preparation we are ready for the proof of step 3, continuing to use the 
notation established at the beginning of step 2, especially (6.172). As in step 1, we 
introduce two copies Hy = Hg = C! of H, as well as two states 


Vas = Vici: 0; ¥; € Hy @ Hp; (6.189) 
Wag = Kn @ V1 @V, ® Wag € Ay’ @ Hy, (6.190) 

where kK, is given by (6.185), we put 
H" =H" @H'®@H, (6.191) 


and in our notation we have ignored the obvious permutations of factors in the tensor 
product. For any € > 0, pick c, € R* such that (c/)* € Qt and 


lc; —ci| < €/dim(A), (6.192) 


which implies that, in the sense of (6.149), we have 


2 
civ; ze >. ui. (6.193) 
Suppose 
c= / pil dis (6.194) 
with p;,q; € N and gced(p;,q;) = 1, and define 
m= pil la. (6.195) 
iAi 
Consequently, writing 
q=1/,[Yomy, (6.196) 


the following quotient is independent of i: 




/ 


Te =4q. (6.197) 

Given the integers m; thus obtained, we define a unitary operator 
u:H” +H", (6.198) 
u= x ul) & | vj) (vil, (6.199) 


i=1 


where wu") is defined in (6.180). From this definition, with additional labels to de- 
note the copies u4 : Hi! — H{’ and ug : Hy! > Hy’, and (6.188), and writing 


i = Vj@V; EHH’, (6.200) 

with corresponding copies 
oa € Hy, @H}; (6.201) 
an € Hg @ Hz, (6.202) 


we then obtain the important relations 


Lyn ® Ly (Wap) = = k,® Per ave Ooi. (6.203) 


i=l 


e 


ua @ Lyw (Wh) = Lvl ef obi @éiy: (6.204) 
B k 


Q 
oy 
iM~ 
iM: 
© Ss 


1 lon Cj ‘ a5 
Ing ual Vib) = FEY Tvl orl, Ek Epps (6205) 
i=1k=1 
€ Lom; " 
ua @up( Was) © 4 Kn® YY) Es @ Ear (6.206) 


ll 
uy 
= 
ll 
uy 


Here the right-hand sides of (6.203) - (6.206) have been arranged so as to obtain 
vectors in the six-fold tensor product 


Hi @H; @ Hy @ A, ® Hp ® Hp. 


We will repeatedly invoke the following lemma, whose proof just unfolds the 
notation (on the appropriate identification of a with $a \otimes 1_{H_2}$ and of b with $1_{H_1} \otimes b$). 

Lemma 6.22. Assume PI and UI. For any pair of unitary operators $u_1$ on $H_1$ and 
$u_2$ on $H_2$, and any unit vector $\psi \in H_1 \otimes H_2$, one has 

$$P_{(u_1 \otimes 1_{H_2})\psi}(b = \gamma|x) = P_\psi(b = \gamma|x); \tag{6.207}$$
$$P_{(1_{H_1} \otimes u_2)\psi}(a = \lambda|x) = P_\psi(a = \lambda|x). \tag{6.208}$$

Since we assume that a is nondegenerate, there is a bijective correspondence 
between its eigenvalues $\lambda_i$ and its eigenvectors $v_i$. Instead of $P(a = \lambda_i)$ dressed 
with whatever parameters x or $\psi$, we may then write $P(v_i)$, where a is understood, 
and analogously for the more complicated operators on tensor products of Hilbert 
space appearing below. Repeatedly using Lemma 6.22, we proceed as follows. 


e From Step 2, using the notation explained below (6.172), 


=» (€4 42 
Frias EM eli, Gao) = 1 (6.209) 


From (6.156) in PE and (6.209), 


P iii gti 
qXi,j; Saal @e pp! 


(Gilt) Sq? (6.210) 
e From (6.155) in SE and (6.210), 
i (et ape 
Pn ®E br eet, (Sep b=. (6.211) 
e From (6.211), CP (whose notation we use), and (6.206), 
2 
Paeinw (Gee be @. (6.212) 
e Recall the number m (satisfying m > m; for all 7). From (6.212) and Lemma 6.22, 
FU yy Sup) wit, (Epi lx) ar g Gi = = 1's smi); 
FO Sup) wis, (ee Ix) © 0 (ji =mj+1,...,m). (6.213) 


We now start a different line of argument, to be combined with (6.213) in due course. 


e From PE, SE, and (6.172), with oe € Ha denoting v; € H, we have 
Py(a = Ailx) = Py(vilx) = Py oy cy gig a, (U4 |x). (6.214) 
e Using Lemma 6.22, (6.203), and (6.204), 
Presadjer§ily Olly (PAR) = PU ymcup) yt (Val): (6.215) 


and hence 
Py(a =A;|x) = PU youn) wily (PA |x). (6.216) 


e From quantum mechanics, notably (6.151), and (6.205), for any i’ 4 i we have 
PU yu oun) vi (vf @ EV.) =0. (6.217) 


• From CQ and (6.217), for any $i' \neq i$, 
Pot yu oun) PA Spa) = 0. (6.218) 
e From PI, 
P(vi|x) = Y P(vi, €%|x); (6.219) 
ii 
P(Esp |) ~ 2 (yee): (6.220) 


e From (6.218), (6.219), and (6.220), 


PO yup) Wie (v4 |x) = Fa (1m up Wiis (eee x). (6.221) 
Finally, from (6.214), (6.221), (6.213), and (6.197) we obtain 


Py(a = Aj|x) £ye =m-g=c (6.222) 


Since $c_i > 0$ we have $c_i^2 = |c_i|^2$; using (6.192) and letting $\varepsilon \to 0$ then proves step 3: 

$$P_\psi(a = \lambda_i|x) = |c_i|^2 = p_\psi(a = \lambda_i). \tag{6.223}$$


Finally, we remove our standing assumption that the spectrum of a be nondegenerate. 
In the degenerate case one has 

$$p_\psi(a = \lambda_i) = \sum_{j_i} p_\psi(v_{j_i}), \tag{6.224}$$

where the sum is over any orthonormal basis $(v_{j_i})_{j_i}$ of the eigenspace of $\lambda_i$. Similarly, 
since each vector $v_{j_i}$ gives $a = \lambda_i$, probability theory gives for all x, 

$$P(a = \lambda_i|x) = \sum_{j_i} P(v_{j_i}|x). \tag{6.225}$$

The nondegenerate case of the theorem (which distinguishes the states $v_{j_i}$) yields 

$$P_\psi(v_{j_i}|x) = p_\psi(v_{j_i}), \tag{6.226}$$

from which (6.157) follows once again: 

$$P_\psi(a = \lambda_i|x) = \sum_{j_i} P_\psi(v_{j_i}|x) = \sum_{j_i} p_\psi(v_{j_i}) = p_\psi(a = \lambda_i).$$

Our proof of the Colbeck–Renner Theorem is now complete. 


Under less stringent assumptions this theorem might have been regarded as the 
conclusion of von Neumann’s program to disprove the possibility of completing 
quantum mechanics by adding hidden variables, but as yet this seems unwarranted. 

Notes 


§6.1. From von Neumann to Kochen–Specker 


‘For decades nobody spoke up against von Neumann’s arguments, and his conclusions were 
quoted by some as the gospel’. (Belinfante, 1973, p. 24) 


Theorem 6.2 is due to von Neumann (1932, §IV.2); it was the first result to impose 
useful constraints on hidden variable theories, anticipating all later literature on the 
subject. Unfortunately (as part of their general anti-Copenhagen rhetoric), Bell and 
his followers left the realm of decent academic discourse by calling von Neumann’s 
arguments against hidden variables ‘silly’ and ‘foolish’, through which they merely 
displayed the depth of their own misunderstanding of von Neumann’s reasoning; see 
Caruana (1995), Bub (2011a), and especially Dieks (2016b). In fact, von Neumann 
(1932, p. 172) carefully qualifies his Theorem 6.2 by stating that it follows ‘im Rahmen 
unserer Bedingungen’ (i.e. ‘given our assumptions’), of which he earlier (on 
p. 164) admits that linearity is physically reasonable only for commuting operators, 
but nonetheless justifies this assumption through an ensemble argument (now outdated, 
but by no means ‘silly’). Though couched in agreeable academic parlance, 
the earlier critique by Hermann (1935) was misguided, too (Dieks, 2016b). 

The Kochen–Specker Theorem is due to Kochen & Specker (1967); the authors 
were originally logicians. A similar but less precise statement had appeared earlier 
in Bell (1966), who was not cited by Kochen and Specker; some authors refer to 
the Bell–Kochen–Specker Theorem. The Nature assumption has been experimentally 
verified, cf. Huang et al (2003). The proof of the fundamental Lemma 6.7 we 
present is essentially due to Kochen and Specker, as simplified by Peres (1995). Our 
independent proof for $\mathbb{C}^4$ is taken from Cabello et al (1996). Surveys of various 
proofs are given by Brown (1992) and Gould (2009); see also Waegell & Aravind 
(2012) and references therein, as well as Bub (1997) for another proof. From the 
Netherlands, we cannot fail to mention the short proof by Gill & Keane (1996). For 
geometric aspects (and even a link with M.C. Escher) see Zimba & Penrose (1993). 

One finds two opposite directions of research around the Kochen–Specker Theorem. 
A computational one, which seems hardly relevant to conceptual issues in 
physics (the goal rather being The Guinness Book of Records), consists of attempts 
to find a minimal set of vectors that cannot be coloured. See, for example, Pavicic 
et al (2005) for arbitrary dimension and Arends (2009) and Uijlen & Westerbaan 
(2015) for $\mathbb{R}^3$, the latter paper showing that at least 22 vectors are needed. 

The other, which is of significant conceptual importance and hence is worth some
more extensive discussion, consists of attempts to find a maximal set of vectors that
can be coloured. That is, one looks for large (preferably dense and measurable)
subsets $S^2_c$ of $S^2$ for which there exists a function $V : S^2_c \to \{0,1\}$ that satisfies:


• $V(-u) = V(u)$ for each $u \in S^2_c$;
• $V(u_1) + V(u_2) + V(u_3) = 2$, for each (orthonormal) basis $(u_1,u_2,u_3)$ of $\mathbb{R}^3$
whose elements lie in $S^2_c$.
The first result in this direction was obtained by Meyer (1999) and Havlicek et al
(2001), who showed that one may take $S^2_c = S^2 \cap \mathbb{Q}^3$; this choice was motivated by
invoking finite precision arguments to circumvent the Kochen–Specker Theorem,
see below. To write down a suitable function $V : S^2 \cap \mathbb{Q}^3 \to \{0,1\}$, we first define
an auxiliary function $S : S^2 \cap \mathbb{Q}^3 \to \mathbb{Z}$ by

$$ S\left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) = \frac{n_3}{m_3} \cdot \frac{\mathrm{lcm}(m_1,m_2,m_3)}{\gcd(n_1,n_2,n_3)}, \tag{6.227} $$

where lcm is the least common multiple and gcd is the greatest common divisor of
the argument (with each fraction $n_i/m_i$ taken in lowest terms). This function is
obviously well defined. Then the following works:

$$ V(x,y,z) = 0 \text{ if } S(x,y,z) \text{ is odd}; \tag{6.228} $$
$$ V(x,y,z) = 1 \text{ if } S(x,y,z) \text{ is even}. \tag{6.229} $$
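
The coloring (6.227)-(6.229) is easy to check by machine. The following minimal Python
sketch is an added illustration, not part of the construction as published: the names S
and V mirror the text, whereas lcm, is_orthonormal, and the sample basis BASIS are
assumptions made for the example. It relies on exact rational arithmetic, which keeps
every coordinate n_i/m_i in lowest terms.

```python
from fractions import Fraction
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def S(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Auxiliary function (6.227) for a rational point (x, y, z) on the unit sphere.

    Fraction stores each coordinate n_i/m_i in lowest terms with m_i > 0.
    """
    n = [c.numerator for c in (x, y, z)]
    m = [c.denominator for c in (x, y, z)]
    big_l = lcm(lcm(m[0], m[1]), m[2])
    g = gcd(gcd(n[0], n[1]), n[2])  # nonzero, since (x, y, z) is not the zero vector
    return (n[2] * big_l) // (m[2] * g)  # this division is exact


def V(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Coloring (6.228)-(6.229): 0 if S is odd, 1 if S is even."""
    return 0 if S(x, y, z) % 2 else 1


def is_orthonormal(basis) -> bool:
    """Exact check that three rational vectors form an orthonormal basis of R^3."""
    return all(
        sum(a * b for a, b in zip(u, v)) == (1 if i == j else 0)
        for i, u in enumerate(basis)
        for j, v in enumerate(basis)
    )


Q = Fraction
# Hypothetical sample basis, built from the identity 1^2 + 2^2 + 2^2 = 3^2.
BASIS = [
    (Q(1, 3), Q(2, 3), Q(2, 3)),
    (Q(2, 3), Q(1, 3), Q(-2, 3)),
    (Q(2, 3), Q(-2, 3), Q(1, 3)),
]

if __name__ == "__main__":
    assert is_orthonormal(BASIS)
    colors = [V(*u) for u in BASIS]
    print(colors, sum(colors))
```

On this basis exactly one vector receives the value 0 and the values sum to 2, as the
second coloring condition requires; the first condition holds because changing the sign
of a vector changes S only by a sign.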


More generally, for an arbitrary ($n$-dimensional) Hilbert space $H$, with $n < \infty$,
Clifton & Kent (2000) proved the existence of a countable dense colorable subset
$\mathcal{P}_1(H)_c$ of $\mathcal{P}_1(H)$ (cf. Definition 6.9), with the additional property that different
resolutions of the identity drawn from $\mathcal{P}_1(H)_c$ never share a projection (so that the
key strategy in the proof of Lemma 6.7, which is based on the existence of overlapping
bases, falls apart). Given some enumeration $(e^{(1)}_i), (e^{(2)}_i), \ldots$ of the countable set of
all resolutions of the identity drawn from $\mathcal{P}_1(H)_c$, so that each $(e^{(k)}_1, \ldots, e^{(k)}_n)$ is a
basis of $H$, $k \in \mathbb{N}$, each possible coloring $W = W_f$ bijectively corresponds to some
function $f : \mathbb{N} \to \{1,\ldots,n\}$ through

$$ W_f(e) = 1 \text{ if } e = e^{(k)}_{f(k)}; \tag{6.230} $$
$$ W_f(e) = 0 \text{ otherwise}. \tag{6.231} $$


Note that because of the total incompatibility of the projections, each $e \in \mathcal{P}_1(H)_c$
belongs to a unique resolution $(e^{(k)}_i)$, so that $W_f$ is well defined. The statistical
predictions of quantum mechanics may then be recovered as follows. For each density
operator $\rho \in \mathcal{D}(H)$ we may define a probability measure $\mu_\rho$ on the set $n^{\mathbb{N}}$ of all
functions $f : \mathbb{N} \to \{1,\ldots,n\}$ by imposing the conditions

$$ \mu_\rho\left(\{f \in n^{\mathbb{N}} \mid W_f(e^{(k)}_i) = \lambda^{(k)}_i,\ \forall i = 1,\ldots,n,\ k \in K\}\right) = \prod_{k \in K} \mathrm{Tr}\left(\rho \prod_{i=1}^{n} [e^{(k)}_i = \lambda^{(k)}_i]\right), \tag{6.232} $$

where $\lambda^{(k)}_i \in \{0,1\}$, $K \subset \mathbb{N}$ is finite, and $[e^{(k)}_i = \lambda^{(k)}_i]$ is the projection onto the
corresponding eigenspace $H_{\lambda^{(k)}_i}$ of the projection $e^{(k)}_i$ (more generally, for $a \in B(H)_{\mathrm{sa}}$
we write $[a = \lambda]$ for the spectral projection $e_\lambda$ defined by $a$ and $\lambda \in \sigma(a)$). The
subset of $n^{\mathbb{N}}$ in the argument of $\mu_\rho$ is hereby declared measurable; existence and
uniqueness of the measure $\mu_\rho$ on a suitable $\sigma$-algebra follow from the Kolmogorov
extension theorem of measure theory, which applies because the marginals (6.232)
satisfy the appropriate consistency conditions, cf. Hermens (2009) for details.
This formula guarantees that the left-hand side vanishes if $\lambda^{(k)}_i = 0$ for each $i$,
and also if $\lambda^{(k)}_i = 1$ for more than one value of $i$. If $K = \{k_0\}$ is a singleton and
$\lambda = (\lambda_1,\ldots,\lambda_n)$, then the right-hand side (and hence the left-hand side) is the Born
probability for the outcome $e^{(k_0)}_i = \lambda_i$ for each $i$, i.e.,

$$ \mu_\rho\left(\{f \in n^{\mathbb{N}} \mid W_f(e^{(k_0)}_i) = \lambda_i,\ \forall i = 1,\ldots,n\}\right) = \mathrm{Tr}\left(\rho \prod_{i=1}^{n} [e^{(k_0)}_i = \lambda_i]\right). \tag{6.233} $$

Consequently, it is true by construction that for any admissible measurement in
quantum mechanics (in that all observables commute), i.e., for each $k_0 \in \mathbb{N}$,
averaging over the ‘hidden variable’ $f \in n^{\mathbb{N}}$ reproduces the statistical predictions of
quantum mechanics. This success is achieved at a high cost, however:

• Two random variables $e^{(k)}_i$ and $e^{(k')}_{i'}$ are statistically independent (with respect to
$\mu_\rho$) whenever $k \neq k'$, even though $\|e^{(k)}_i - e^{(k')}_{i'}\|$ may be arbitrarily small.
• For each $f \in n^{\mathbb{N}}$ the associated coloring $W_f$ is maximally discontinuous, in that
for each $u \in \mathcal{P}_1(H)_c$ and each $\varepsilon > 0$ there is $u' \in \mathcal{P}_1(H)_c$ such that although
$\|e_u - e_{u'}\| < \varepsilon$ one has $W_f(e_u) \neq W_f(e_{u'})$, so that in fact $|W_f(e_u) - W_f(e_{u'})| = 1$.


These facts were noted by Clifton & Kent themselves, and Appleby (2005) proved
that they are a necessary feature of all constructions that involve sufficiently large
subsets of $\mathcal{P}_1(H)$ that can be colored.
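
The first of the two costs listed above can be made concrete in the smallest case. The
sketch below is an added illustration under simplifying assumptions that are not in the
text: it takes n = 2, real unit vectors u and u' at angles theta1 and theta2 in the plane,
a pure state at angle alpha, and it treats the two bases containing u and u' as distinct
members of the enumeration, so that their colorings are independent under the product
form of (6.232). All function names are hypothetical.

```python
import math


def born_probability(alpha: float, theta: float) -> float:
    """Tr(rho e_u) for the pure state at angle alpha and the unit vector u at angle theta."""
    return math.cos(alpha - theta) ** 2


def projection_distance(theta1: float, theta2: float) -> float:
    """Operator norm of e_u - e_u' for unit vectors in R^2 at angles theta1 and theta2."""
    return abs(math.sin(theta1 - theta2))


def disagreement_probability(alpha: float, theta1: float, theta2: float) -> float:
    """Probability that W_f(e_u) != W_f(e_u') when the two values are independent."""
    p = born_probability(alpha, theta1)
    q = born_probability(alpha, theta2)
    return p * (1 - q) + (1 - p) * q


if __name__ == "__main__":
    alpha = math.pi / 4
    for gap in (1e-1, 1e-3, 1e-6):
        print(gap, projection_distance(0.0, gap), disagreement_probability(alpha, 0.0, gap))
```

As the gap between the two angles shrinks, the distance between the projections tends to
zero while the probability that the two assigned values differ stays near 1/2 for this
choice of state.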

Without challenging their mathematical significance, these discontinuities un- 
dermine any potential physical relevance such models might have, and this in turn 
challenges the reason such models were introduced in the first place (Meyer, 1999), 
namely the (alleged) finite precision loophole of the Kochen–Specker Theorem.

The thrust of this loophole is that it would be an illusion for an experimentalist
like Alice to claim that she measures some observable $a$ with infinite accuracy;
in fact, given $\varepsilon > 0$ she might equally well measure some $a'$ with $\|a - a'\| < \varepsilon$.
Consequently, finding a dense colorable subset $\mathcal{P}_1(H)_c \subset \mathcal{P}_1(H)$ should suffice
for a hidden variable interpretation of quantum mechanics, since if Alice believes
she measures some projection $e$, the model assigns a value $W(e')$ to the projection
$e' \in \mathcal{P}_1(H)_c$ she actually measures (where $e'$ is selected by some algorithm that
is part of the theory itself, cf. Clifton & Kent (2000)), and presents that value to
Alice as the outcome of her measurement. However, owing to the discontinuities
just mentioned, this value is as arbitrary as the identification of $e'$.

As emphasized by Barrett & Kent (2004), this arbitrariness, although perhaps
undesirable, does not by itself affect the ability of the Clifton–Kent model to
reproduce the statistical predictions of quantum mechanics. On the other hand, it would be
pretty awkward to have a theory whose individual value attributions are completely
arbitrary, especially since the finite precision argument is predicated on the idea that
observables close to the one Alice believes herself to measure (i.e., $e$) should have
approximately the same value as the one she actually does measure (namely, $e'$).
If this is not the case, her measurements are pointless and the hidden variable $W_f$
would be empirically inaccessible and hence truly “hidden” (Appleby, 2005).
See also Hermens (2009, 2016). This last point applies to Corollary 6.12, which
would no longer be true if the set $X_A$ of all bases of $\mathbb{R}^3$ in Definition 6.11 were
replaced by some subset $X_A^c \subset X_A$ drawn from a colorable subset $S^2_c$ of $S^2$. Each
$z \in X_Z$ would then correspond to some coloring $u \mapsto F(u,z)$ of $S^2_c$, which, by the above
discussion, would be maximally discontinuous and hence empirically inaccessible.
Nonetheless, such a theory does exist in principle.

The aim of maximizing colorable sets was pursued in a different direction by Bub
& Clifton (1996); see also Bub (1997). Given a “preferred” observable $a \in B(H)_{\mathrm{sa}}$
and a pure state $e \in \mathcal{P}_1(H)$, these authors look for a maximal sublattice $\mathcal{P}(e,a)$ of
$\mathcal{P}(H)$ that contains all spectral projections of $a$ (but, despite the notation $\mathcal{P}(e,a)$,
does not necessarily contain $e$!), admits sufficiently many lattice homomorphisms
$h : \mathcal{P}(e,a) \to \{0,1\}$ (i.e., binary valuations) such that the Born measure $\mu_e$ on
$\sigma(a)$, i.e., $\mu_e(\Delta) = \mathrm{Tr}(e e_\Delta)$, $\Delta \subset \sigma(a)$, can be reproduced by averaging over these
homomorphisms, and finally is invariant under all unitary isomorphisms of $\mathcal{P}(H)$
that commute with both $e$ and $a$. Equivalently, one wants a maximal C*-subalgebra
$A(a,e)$ of $B(H)$ that contains $a$, admits sufficiently many dispersion-free states so as
to reproduce the Born probabilities defined by $a$ in the given state $e$, and is invariant
in the said way (a fourth condition used by Bub and Clifton is superfluous; see Bub,
1997, p. 128). Assuming for simplicity that $n = \dim(H) < \infty$, the answer is

$$ A(a,e) = C^*(e_\lambda e e_\lambda, \lambda \in \sigma(a))', \tag{6.234} $$

where, as always, $e_\lambda$ is the projection onto the eigenspace $H_\lambda$ for $\lambda \in \sigma(a)$, and the
prime denotes the commutant (one might as well take the commutant of the set of all
$e_\lambda e e_\lambda$). Equivalently, putting $e = e_\psi = |\psi\rangle\langle\psi|$, eq. (6.234) is the C*-algebra
generated by all projections $f_\lambda$ onto the nonzero components $e_\lambda\psi$ of $\psi$ in each $H_\lambda$ and all
one-dimensional projections that are orthogonal to all $f_\lambda$ (given that $\dim(H) < \infty$,
this is the same as the linear span of these projections). Thus $A(a,e)$ always contains
$C^*(a)$ (since it contains each $e_\lambda$, $\lambda \in \sigma(a)$), but note that $A(a,e)$ need not be
commutative. In comparison, if the requirement had been the reproduction of all Born
probabilities for arbitrary pure states $e$ rather than for some given $e$, the answer
would have been any maximal abelian C*-algebra in $B(H)$ that contains $C^*(a)$; if $a$
has non-degenerate spectrum, this is just $C^*(a)$ itself. The simplest possibility is

$$ A(1_H, e) = C^*(e)' = \{e\}', \tag{6.235} $$

which is the linear span of all projections $f \in \mathcal{P}(H)$ for which either $e \leq f$ or
$e \leq 1_H - f$ (i.e., if $e = e_\psi$, then either $\psi \in fH$ or $\psi \in (fH)^\perp$). In other words, we
have $a \in A(1_H,e)$ iff $\psi$ is an eigenvector of $a$ (i.e. the eigenvector-eigenvalue link).
Each dispersion-free state on $A(a,e)$, or, equivalently, each homomorphism
$h_\lambda : \mathcal{P}(e,a) \to \{0,1\}$, corresponds to one of the projections $f_\lambda$ through $h_\lambda(f_\lambda) = 1$
and $h_\lambda(f) = 0$ for all other one-dimensional projections $f$ in $A(a,e)$. The Born
probabilities from $e$ are then recovered by assigning (Born) measure $\mathrm{Tr}(e f_\lambda)$ to $h_\lambda$.
Though interesting, this result mainly supports so-called modal interpretations of 
quantum mechanics, which we reject, since they tell us nothing physical about the 
measurement process and address the measurement problem only philosophically. 
§6.2. The Free Will Theorem

The Free Will Theorem was published in two versions by Conway & Kochen 
(2006, 2009). Analogous results had previously been published by Heywood & 
Redhead (1983), Stairs (1983), Brown & Svetlichny (1990), and Clifton (1993), 
of which only the first paper was cited by Conway and Kochen. Moreover, the 
close relationship to Bell’s (1964) Theorem might well be insisted on as a topic that 
should have been discussed in the original papers. Other critical literature (making 
the points listed in the preamble to this chapter) includes Bassi & Ghirardi (2007), ’t
Hooft (2007), Goldstein et al (2010), Wüthrich (2011), Hemmick & Shakur (2012),
Cator & Landsman (2014), Hermens (2014, 2015), and Walleczek (2016). 

The original (Strong) Free Will Theorem (FWT) states that three assumptions, 
called SPIN, TWIN, and MIN, imply that the response of a spin-one particle to the 
bipartite experiment with spin-one particles described above ‘is not a function of 
properties of that part of the universe that is earlier than this response (...).’ Here 
SPIN and TWIN are the first and second half of our Nature axiom, whilst MIN ex- 
presses a form of context-locality as well as the loose assumption that Alice and 
Bob may ‘freely choose’ their settings a and b, respectively. Accordingly, in our 
notation, Conway and Kochen only use the parameter space Z, rather than the full 
space X we need in order to consistently axiomatize determinism. Their formulation 
contains an implicit assumption of determinism, whose precise nature only becomes 
clear from their proof, and which is akin to our formulation, except for the crucial 
difference that the function they allude to only acts on the particle variables and not 
on the settings of the experiment (of which, as already noted, Conway and Kochen 
just say that the experimenters can ‘freely choose’ them). 

Conway and Kochen paraphrase their theorem as follows: 


‘if indeed we humans have free will, then elementary particles already have their own small 
share of this valuable commodity. More precisely, if the experimenter can freely choose 
the directions in which to orient his apparatus in a certain measurement, then the particle’s
response (to be pedantic—the universe’s response near the particle) is not determined by the 
entire previous history of the universe. (...) our theorem asserts that if experimenters have 
a certain freedom, then particles have exactly the same kind of freedom. Indeed, it is natural 
to suppose that this latter freedom is the ultimate explanation of our own. (...) Granted our 
three axioms [i.e., the physical ones and freedom of choice], the Free Will Theorem shows 
that nature itself is nondeterministic.’ 


However, such far-reaching conclusions seem unwarranted by the actual technical 
content of the theorem. Indeed, though it is also assumed in Bell’s first theorem (see
§6.5 below), the conjunction of Determinism and Freedom is a priori uncomfortable,
especially since the main novelty of the FWT lies in the emphasis Conway and
Kochen (unlike Bell) put on free will. The authors acknowledge at least this point 
already on the first page of their first paper (Conway & Kochen, 2006), in which 
they anticipate criticism of the kind: 


‘“I saw you put the fish in!” said a simpleton to an angler who had used a minnow to catch
a bass.’ 


Indeed, also after more serious philosophical analysis, it has been concluded that: 
‘Their [Conway & Kochen’s] case against determinism thus has all the virtues of theft over
honest toil. It is truly indeterminism in, indeterminism out.’ (Wüthrich, 2011)


Our formulation of the FWT, in which the original allusion to undefined free will in 
allowing arbitrary settings of the experiment has been replaced by complete deter- 
minism including the settings, avoids this criticism. 

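To derive (6.35) - (6.38), we use (6.21) to write down the formulae

$$ \begin{aligned}
P_{\Psi_0}(F_i = 1, G_j = 1 \mid A = a, B = b) &= \langle \Psi_0, (1_3 - |u_i\rangle\langle u_i|) \otimes (1_3 - |v_j\rangle\langle v_j|) \Psi_0 \rangle; \\
P_{\Psi_0}(F_i = 0, G_j = 0 \mid A = a, B = b) &= \langle \Psi_0, |u_i\rangle\langle u_i| \otimes |v_j\rangle\langle v_j| \Psi_0 \rangle; \\
P_{\Psi_0}(F_i = 1, G_j = 0 \mid A = a, B = b) &= \langle \Psi_0, (1_3 - |u_i\rangle\langle u_i|) \otimes |v_j\rangle\langle v_j| \Psi_0 \rangle; \\
P_{\Psi_0}(F_i = 0, G_j = 1 \mid A = a, B = b) &= \langle \Psi_0, |u_i\rangle\langle u_i| \otimes (1_3 - |v_j\rangle\langle v_j|) \Psi_0 \rangle.
\end{aligned} $$

For example, for any pair of unit vectors $u, v$ we have

$$ \begin{aligned}
\langle \Psi_0, |u\rangle\langle u| \otimes |v\rangle\langle v| \Psi_0 \rangle
&= \tfrac{1}{3} \langle e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3, |u\rangle\langle u| \otimes |v\rangle\langle v| (e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3) \rangle \\
&= \tfrac{1}{3} \langle e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3, \langle u, v\rangle\, u \otimes v \rangle \\
&= \tfrac{1}{3} \langle u, v\rangle^2,
\end{aligned} $$

which gives (6.36). The other cases are similar.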
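
The computation leading to (6.36) can also be checked numerically. The following sketch
is an added illustration, not the text's derivation: it assumes real unit vectors,
represents u ⊗ v as a flat list of nine products, and obtains the remaining three
probabilities from the two marginals by inclusion-exclusion. All function names are
hypothetical.

```python
import math

E = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]  # standard basis of R^3


def kron(u, v):
    """Components of u (x) v in the product basis e_i (x) e_j."""
    return [ui * vj for ui in u for vj in v]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


# Psi_0 = (e1 (x) e1 + e2 (x) e2 + e3 (x) e3) / sqrt(3)
PSI0 = [sum(kron(e, e)[k] for e in E) / math.sqrt(3) for k in range(9)]


def overlap_sq(u, v):
    """<Psi_0, |u><u| (x) |v><v| Psi_0> = <u (x) v, Psi_0>^2 for real unit vectors."""
    return dot(kron(u, v), PSI0) ** 2


def joint_probabilities(u, v):
    """The four probabilities P(F_i = f, G_j = g), keyed by (f, g)."""
    p00 = overlap_sq(u, v)
    alice0 = sum(overlap_sq(u, e) for e in E)  # <Psi_0, |u><u| (x) 1_3 Psi_0>
    bob0 = sum(overlap_sq(e, v) for e in E)    # <Psi_0, 1_3 (x) |v><v| Psi_0>
    return {
        (0, 0): p00,
        (0, 1): alice0 - p00,
        (1, 0): bob0 - p00,
        (1, 1): 1.0 - alice0 - bob0 + p00,
    }


if __name__ == "__main__":
    t = 0.3
    u, v = (1.0, 0.0, 0.0), (math.cos(t), math.sin(t), 0.0)
    print(joint_probabilities(u, v))
```

In this sketch the (0, 0) entry agrees with ⟨u,v⟩²/3, and the two mixed entries add up to
(2/3)(1 − ⟨u,v⟩²), which is the probability that the two outcomes differ.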

The implications of the finite precision loophole of the Kochen–Specker Theorem
for the Free Will Theorem were analyzed by Hermens (2014), who concluded
that this loophole does not apply. We give a more precise argument to this effect. 

We have dense colorable subsets $X_A^c \subset X_A$ and $X_B^c \subset X_B = X_A$, where $X_A^c$ may
or may not coincide with $X_B^c$. If not, the perfect correlation condition (6.54) in the
Nature assumption cannot even be stated, but even if $X_A^c = X_B^c$, since finite precision
of experiment has been declared to be an issue it would be quite out of character to
impose (6.54). Instead, one needs a probabilistic version of this condition, of which
it will turn out that it cannot be satisfied. As in the notes to the previous section, for
each density matrix $\rho$ one needs a probability measure $\mu_\rho$ on $Z$ that reproduces the
statistical quantum-mechanical predictions for the associated quantum state.
Compared to the notes to the previous section, the role of $W$ is now played by $z$, in that
for given $F$ and $G$ one might write

$$ W(a,b) = (F(a,z), G(b,z)). \tag{6.236} $$

This measure may be constructed analogously to (6.232), i.e., for any sequence
$(a^{(k)})$ of bases drawn from $X_A^c$, any sequence $(b^{(k)})$ of bases drawn from $X_B^c$, and any
sequences $(\lambda^{(k)})$ and $(\gamma^{(k)})$ in $\Lambda$, cf. (6.22), where $k \in K \subset \mathbb{N}$ is arbitrary, we define

$$ \mu_\rho\left(\{z \in Z \mid F(a^{(k)},z) = \lambda^{(k)},\ G(b^{(k)},z) = \gamma^{(k)},\ k \in K\}\right) = \prod_{k \in K} \mathrm{Tr}\left(\rho \prod_{i,j=1}^{3} [J^2_{u^{(k)}_i} = \lambda^{(k)}_i] \cdot [J^2_{v^{(k)}_j} = \gamma^{(k)}_j]\right), \tag{6.237} $$
where, as in the main text,

$$ a = (u_1, u_2, u_3); \tag{6.238} $$
$$ b = (v_1, v_2, v_3). \tag{6.239} $$

Note that $J^2_{u_i}$ acts on Alice’s Hilbert space $\mathbb{C}^3$ whilst $J^2_{v_j}$ acts on Bob’s. In particular,
for fixed $k_0 \in K$ and $\lambda, \gamma \in \Lambda$, we have the special case of (6.237) for compatible
measurements, viz.

$$ \mu_\rho\left(\{z \in Z \mid F(a^{(k_0)},z) = \lambda,\ G(b^{(k_0)},z) = \gamma\}\right) = \mathrm{Tr}\left(\rho \prod_{i,j=1}^{3} [J^2_{u_i} = \lambda_i] \cdot [J^2_{v_j} = \gamma_j]\right), $$

where in the main text we would have written $P_\rho(F = \lambda, G = \gamma \mid A = a, B = b)$ for the
right-hand side. Hence for the correlated state $\rho = |\Psi_0\rangle\langle\Psi_0|$ we obtain from (6.42):


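$$ \mu_{\Psi_0}\left(\{z \in Z \mid F_i(a,z) \neq G_j(b,z)\}\right) = \tfrac{2}{3}\left(1 - \langle u_i, v_j\rangle^2\right), \tag{6.240} $$

which of course vanishes if $u_i = v_j$. If the expression $1 - \langle u_i, v_j\rangle^2$ appearing here is
small, then the projections $e_{u_i}$ and $e_{v_j}$ are close (in norm), since

$$ \|e_{u_i} - e_{v_j}\|^2 \leq 2\left(1 - \langle u_i, v_j\rangle^2\right). \tag{6.241} $$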


Eq. (6.240) therefore allows us to make rigorous sense of Hermens’ (2014) heuristic
idea that the assumption (6.54) in the FWT should be modified as follows:

‘if $\|e_{u_i} - e_{v_j}\|$ is small, then in most of the cases $F_i(a,z) = G_j(b,z)$.’

Namely, we replace (6.54) by the following approximate correlation condition:

• For every $\varepsilon > 0$ there is $\delta > 0$ such that if $1 - \langle u_i, v_j\rangle^2 < \delta$, then
$$ \mu_{\Psi_0}\left(\{z \in Z \mid F_i(a,z) \neq G_j(b,z)\}\right) < \varepsilon. \tag{6.242} $$

Indeed, if the theory existed, one could simply take $\delta = \varepsilon$. However, a theory
satisfying (6.242) does not exist, as can be proved by contradiction: if $F_i(a,z) = G_j(b,z)$
for all pairs $(u_i,v_j)$ such that $1 - \langle u_i, v_j\rangle^2 < \varepsilon$, then the proof of Theorem 6.13
shows not only that (6.32) still holds on the modified Nature assumption (so that
$F(\cdot,z)$ again defines a coloring of $S^2$), but that in addition we have

$$ 1 - \langle u, u'\rangle^2 < \delta \ \Rightarrow\ F(u,z) = F(u',z). \tag{6.243} $$

In particular, the apparently weaker correlation condition ending with (6.242) is
actually stronger than its exact counterpart (6.54).

Thus Theorem 6.13 still holds on this revised Nature assumption, so that unlike
the Kochen–Specker Theorem, the Free Will Theorem is immune to the finite
precision loophole. The price for this immunity is that, quite against the spirit of the
FWT, some probabilistic reasoning had to be invoked, so that the difference between 
the FWT and Bell’s first theorem has blurred even further. 
§6.3. Philosophical intermezzo: Free will in the Free Will Theorem

The literature on free will is immense. Introductory accounts include Walter 
(2001), which focuses on the connection with neuroscience, Doyle (2011), and 
Beebee (2013), the second of which remains largely philosophical, the third even 
completely. A very sophisticated recent defense of compatibilism is Ismael (2016). 
Lewis’s ‘local miracle compatibilism’ was proposed in Lewis (1981). What’s more: 


‘[Lewis’s paper is] the finest essay that has ever been written in defense of compatibilism— 
possibly the finest essay that has ever been written about any aspect of the free will problem.’ 
(van Inwagen, 2008). 


Saunders (1968) already made a point similar to Lewis’s; see also Moore (1912, Ch. 
6). For Lewis’s theory of counterfactuals see Lewis (1973, 1979, 2000), as well as 
Menzies (2014). See also Fischer (1994), Beebee (2003, 2013), and Vihvelin (2013). 

Although Lewis’s position is called local miracle compatibilism, a miracle takes
place neither in the actual world where Alice’s hand is at rest nor in the possible 
world where she raises it, i.e., a law is broken neither in the former nor in the latter: 


‘This is what Lewis means by a ‘miracle’: an event m is a miracle if and only if m occurs
at possible world w, and m is contrary to some actual law (or combination of laws) L. The
point here is that while m is a miracle in Lewis’s sense, it is not contrary to any of w’s laws
of nature. At w, L simply isn’t a law in the first place. So, as things actually happened— 
in the actual world—L is a law, and m does not occur, so there is no miracle in the usual 
sense of ‘miracle’. m is only a ‘miracle’ in Lewis’s special sense of ‘miracle’: something 
(m) happens in w that is contrary to the laws of nature in the actual world.’ 

(Beebee, 2013, p. 62) 


Unfortunately, confusion may arise if the quotation in the main text ‘if I did it, a law 
would be broken’ from Lewis (1981) is subjected to the following explanation: 


‘On Lewis’s account of counterfactuals, the truth conditions for counterfactuals—what 
makes them true—are as follows. Suppose we have the counterfactual ‘if A had been the 
case, B would have been the case’ (so if A is ‘I miss the bus’ and B is ‘I’m late’, this
counterfactual just says, ‘if I’d missed the bus, I would have been late’). This counterfactual will
be true if and only if, at the closest possible world to the actual world at which A is true, B 
is also true. So, our sample counterfactual, ‘if I’d missed the bus, I would have been late’, 
is true if and only if: at the closest possible world to the actual world at which I miss the 
bus, I’m late.’ (Beebee, 2013, p. 60). 


Removing any possible remaining doubt, on p. 62 she mentions that the closest 
possible world where I miss the bus is the world w. According to this explanation, 
then, Lewis’s sentence ‘if I did it, a law would be broken’, would mean that at the 
closest possible world to the actual world in which I did it, a law is broken, i.e., in w. 
But according to Beebee’s definition quoted in the main text of what Lewis means 
by a miracle, apparently this is not the right reading (and indeed it would, in our 
view, be nonsensical). Moreover, Lewis (1981) emphasizes that in the first bullet 
point in the main text above—which he defends—it is not the agent who would 
break a law, whereas in the second bullet point —rejected by Lewis—it is; in the 
first it is the breaking of some law at an earlier time that enables the agent to do 
what she, in our actual world, did not do. Thus Lewis’s phrasing seems awkward. 
Our development of Lewis’s argument is indebted to Vihvelin (2013, pp. 164–165),
who (re)states Lewis’s first bullet point as the following conjunction:


1. Slightly Different Past: If I had raised my hand, the past would still have been 
exactly the same until shortly before the time of my decision. 

2. Slightly Different Laws: If I had raised my hand, the laws would have been ever 
so slightly different in a way that permitted a divergence from the lawful course 
of actual history shortly before the time of my decision. 


A second way in which Alice could (counterfactually) have raised her hand is
through an instant (counterfactual) modification of the state of the world, as in Ben- 
nett (1984). This has been explicated by Vihvelin (2013, p. 165), too: 


1. Same Laws: If I had raised my hand, the laws would still have been the same. 
2. Completely Different Past: If I had raised my hand, past history would have 
been different all the way back to the Big Bang. 


Here we prefer to write Different Past, since even though in this scenario the state 
indeed (by determinism) would have been different all the way back to the Big Bang, 
the entire trajectory of the world may or may not be close to the actual one. In this 
scenario, the two cases Lewis distinguishes take the form in the main text. 


Since the main novelty of their papers lies in the emphasis on free will, the reader 
might wonder what Conway & Kochen themselves have to say about the subject. As 
we can read in the delightful biography of Conway by Roberts (2015), or watch in 
his video lectures on the Free Will Theorem (Conway, 2009), free will is indeed of 
great importance to at least the first author of the theorem. Unfortunately, his interest 
in free will seems unaccompanied by any philosophical sophistication, e.g.: 


‘Compatibilism in my view is silly. Sorry, I shouldn’t just say straight off that it is silly. 
Compatibilism is an old viewpoint from previous centuries when philosophers were talking 
about free will. They were accustomed to physical theory being deterministic. And then
there’s the question: How can we have free will in this deterministic universe? Well, they 
sat and thought for ages and ages and ages and read books on philosophy and God knows 
what and they came up with compatibilism, which was a tremendous wrenching effect to 
reconcile 2 things which seemed incompatible. And they said they were compatible after 
all. But nobody would ever have come up with compatibilism if they thought, as turns out 
to be the case, that science wasn’t deterministic. The whole business of compatibilism was 
to reconcile what science told you at the time, centuries ago down to 1 century ago: Science
appeared to be totally deterministic, and how can we reconcile that with free will, which 
is not deterministic? So compatibilism, I see it as out of date, really. It’s doing something 
that doesn’t need to be done. However, compatibilism hasn’t gone out of date, certainly, 
as far as the philosophers are concerned. Lots of them are still very keen on it. How can 
I say it? If you do anything that seems impossible, you’re quite proud when you appear 
to have succeeded. And so really the philosophers don’t want to give up this notion of 
compatibilism because it seems to damned clever. But my view is it’s really nonsense. And 
it’s not necessary. So whether it actually is nonsense or not doesn’t matter.’ 

(Conway, quoted in Roberts, 2015, pp. 361-362). 


Finally, our version of van Inwagen’s (1975) Consequence Argument is due to 
Beebee (2003), and the novel parts of this section are based on Landsman (2016c). 
For interesting philosophical criticism of this approach, see De Mola (2016). 
§6.4. Technical intermezzo: The GHZ-Theorem

The GHZ Theorem appeared in Greenberger et al (1990). See also Clifton, Redhead,
& Butterfield (1991) and Bub (1997). Innumerable variations on and generalizations
of such arguments may be given, leading to equally many Free Will
Theorems. All of these have their roots in algebraic properties of matrices, which 
hidden variable theories (in vain) try to reproduce. 


§6.5. Bell’s theorems

The original contributions to the theme of this section are Bell (1964, 1976), of 
which the first is one of the most famous papers of 20th century theoretical physics. 
Since there are more than 10,000 papers citing Bell (1964) alone, it is impossible 
to discuss all literature relevant to Bell’s work. What we call his first theorem orig- 
inates with Bell (1964), which incidentally was written after Bell (1966), but our 
treatment of the settings (taken from Cator & Landsman, 2014) is different. Though 
originally motivated as an attempt to make the Free Will Theorem look less of a pe- 
titio principii, it also addresses a problem Bell faced even according to some of his 
staunchest supporters (Norsen, 2009; Seevinck & Uffink, 2011), namely the tension 
between the idea that the hidden variables (in the pertinent causal past) should on 
the one hand include all ontological information relevant to the experiment, but on 
the other hand should leave Alice and Bob free to choose any settings they like. 

His second theorem comes from Bell (1976), followed by Bell (1990a). 

Apart from his own papers, which are reprinted in Bell, Gottfried & Veltman (2001), 
treatments of Bell’s Theorems we regard as sound include Fine (1982), Jarrett 
(1984), Pitowsky (1989), van Fraassen (1991), Butterfield (1992a,b), Bub (1997), 
Werner & Wolf (2001), Liang, Spekkens, & Wiseman (2011), Shimony (2013),
Wiseman (2014), and Brown & Timpson (2015). Recent and mathematically inno- 
vative approaches include Abramsky & Brandenburger (2011), Acin et al (2015), 
and Fritz (2016). For history, see Gilder (2008) and Kaiser (2010). 

Unfortunately, we have not been able to come to grips with (and hence do not 
cite) literature claiming that Bell’s theorems are false, or have nothing to do with 
hidden variables, or prove that quantum mechanics (if not nature itself!) is nonlocal 
per se, or that he never changed his mind and only has one theorem saying it all. 


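The verification of (6.102) - (6.105) is analogous to the above computations
deriving (6.35) - (6.38). In terms of the unit vector

$$ v_a = \begin{pmatrix} \cos a \\ \sin a \end{pmatrix}, \tag{6.244} $$

the observable $F$ Alice measures on setting $A = a$ is the projection $e_a = |v_a\rangle\langle v_a|$,
and similarly for Bob. Hence the corresponding Born probabilities are given by

$$ \begin{aligned}
P_{\Psi_0}(F = 1, G = 1 \mid A = a, B = b) &= \langle \Psi_0, e_a \otimes e_b \Psi_0 \rangle; \\
P_{\Psi_0}(F = 0, G = 0 \mid A = a, B = b) &= \langle \Psi_0, (1_2 - e_a) \otimes (1_2 - e_b) \Psi_0 \rangle; \\
P_{\Psi_0}(F = 1, G = 0 \mid A = a, B = b) &= \langle \Psi_0, e_a \otimes (1_2 - e_b) \Psi_0 \rangle; \\
P_{\Psi_0}(F = 0, G = 1 \mid A = a, B = b) &= \langle \Psi_0, (1_2 - e_a) \otimes e_b \Psi_0 \rangle.
\end{aligned} $$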
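For example, we have

$$ \begin{aligned}
\langle \Psi_0, e_a \otimes e_b \Psi_0 \rangle
&= \tfrac{1}{2} \langle e_1 \otimes e_1 + e_2 \otimes e_2, |v_a\rangle\langle v_a| \otimes |v_b\rangle\langle v_b| (e_1 \otimes e_1 + e_2 \otimes e_2) \rangle \\
&= \tfrac{1}{2} \langle e_1 \otimes e_1 + e_2 \otimes e_2, (\cos a \cos b + \sin a \sin b)\, v_a \otimes v_b \rangle \\
&= \tfrac{1}{2} (\cos a \cos b + \sin a \sin b)^2 \\
&= \tfrac{1}{2} \cos^2(a - b).
\end{aligned} $$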


The CHSH-inequality (6.117) is due to Clauser, Horne, Shimony, & Holt (1969).
The definitive (i.e., loophole-free) experimental verification of its violation in nature
is Hensen et al. (2015). A direct proof of (6.117) starts from the simpler inequality

$$ P(F \neq H) \leq P(F \neq G) + P(G \neq H), \tag{6.245} $$

for three {0,1}-valued random variables $F, G, H$, which implies (6.117). To prove
(6.245), one just writes

$$ \begin{aligned}
P(F \neq H) = {} & P(F = 1, G = 1, H = 0) + P(F = 1, G = 0, H = 0) \\
& + P(F = 0, G = 1, H = 1) + P(F = 0, G = 0, H = 1),
\end{aligned} $$

etc., and notes that each term on the left-hand side of (6.245) also occurs on the
right-hand side. Since each term lies in [0,1] and hence is nonnegative, this implies (6.245).
Our proof of Proposition 6.17 follows Werner & Wolf (2001), as does our proof of
Theorem 6.18 (though not our formulation thereof, which once again derives from
Cator & Landsman (2014)). This proof shows that, as first noted by Fine (1982) and
analyzed more deeply in Butterfield (1992b), there is no real distinction between
the possibility of reproducing given (empirical) probabilities $P(F = \lambda, G = \gamma \mid A = a, B = b)$
that satisfy Bell locality by a local deterministic hidden variable theory or
by a local stochastic hidden variable theory. Most current research in this direction,
sparked by Popescu & Rohrlich (1994), is therefore concerned with theories defined
by formal joint conditional probabilities that satisfy a no signaling condition like OI
instead of Bell locality, cf. Bub (2011b) and Brunner et al (2014) for reviews.
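
The elementary inequality (6.245) and its failure for the quantum-mechanical
probabilities can both be checked by machine. The sketch below is an added illustration
and not the text's derivation of (6.117). It makes two assumptions of its own: the
settings 0, π/6, π/3 are an arbitrary choice, and reading the outcomes at three settings
as jointly distributed variables F, G, H is exactly the hidden-variable hypothesis under
test. All function names are hypothetical.

```python
import itertools
import math
import random


def mismatch(joint, i, j):
    """P(X_i != X_j) for a joint distribution on {0,1}^3 given as {(f, g, h): prob}."""
    return sum(p for bits, p in joint.items() if bits[i] != bits[j])


def satisfies_6_245(joint, tol=1e-12):
    """Check P(F != H) <= P(F != G) + P(G != H), up to rounding."""
    return mismatch(joint, 0, 2) <= mismatch(joint, 0, 1) + mismatch(joint, 1, 2) + tol


def random_joint(rng):
    """A random probability distribution on the eight points of {0,1}^3."""
    weights = [rng.random() for _ in range(8)]
    total = sum(weights)
    points = itertools.product((0, 1), repeat=3)
    return {bits: w / total for bits, w in zip(points, weights)}


def quantum_mismatch(a, b):
    """P(F != G | A = a, B = b) in the state Psi_0 = (e1 (x) e1 + e2 (x) e2)/sqrt(2)."""
    basis = [(1.0, 0.0), (0.0, 1.0)]
    psi0 = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
    va, vb = (math.cos(a), math.sin(a)), (math.cos(b), math.sin(b))

    def overlap_sq(u, v):
        # <u (x) v, Psi_0>^2, with components ordered as in itertools.product
        return sum(ui * vj * c for (ui, vj), c in zip(itertools.product(u, v), psi0)) ** 2

    p11 = overlap_sq(va, vb)
    alice1 = sum(overlap_sq(va, e) for e in basis)  # <Psi_0, e_a (x) 1_2 Psi_0>
    bob1 = sum(overlap_sq(e, vb) for e in basis)    # <Psi_0, 1_2 (x) e_b Psi_0>
    return (alice1 - p11) + (bob1 - p11)


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(satisfies_6_245(random_joint(rng)) for _ in range(1000)))
    lhs = quantum_mismatch(0.0, math.pi / 3)
    rhs = quantum_mismatch(0.0, math.pi / 6) + quantum_mismatch(math.pi / 6, math.pi / 3)
    print(lhs, rhs)
```

Every sampled joint distribution satisfies (6.245), whereas for the chosen settings the
quantum mismatch probabilities give 3/4 on the left against 1/2 on the right.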

Formal conditional probabilities of the kind that Bell’s second theorem uses have
been axiomatized by e.g. Popper (1938) and Rényi (1955); the following axioms are
theorems if conditional probabilities are defined à la Kolmogorov by (1.1). Let $\Sigma$ be
some $\sigma$-algebra and let $\mathcal{F} \subset \Sigma \setminus \{\emptyset\}$ be an ideal in $\Sigma$ in the sense that if $B \in \Sigma$ and
$C \in \mathcal{F}$, then $B \cap C \in \mathcal{F}$. A conditional probability on $(\Sigma, \mathcal{F})$ is a map

$$ P : \Sigma \times \mathcal{F} \to [0,1]; \tag{6.246} $$
$$ (A, C) \mapsto P(A|C), \tag{6.247} $$

such that:


1. For each $C \in \mathcal{F}$ the map $A \mapsto P(A|C)$ is a probability measure on $\Sigma$;
2. $P(A \cap B|C) = P(A|B \cap C) \cdot P(B|C)$, for each $A, B \in \Sigma$ and $C \in \mathcal{F}$.
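
That these two axioms hold for Kolmogorov-style conditional probabilities can be
verified exhaustively on a small example. The sketch below is an added illustration
under assumptions of its own: a hypothetical four-point sample space with unequal
weights, every subset taken as an event, and conditioning allowed on any event of
positive probability. All names are made up for the example.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical four-point sample space with unequal weights summing to 1.
WEIGHTS = {
    "w1": Fraction(1, 2),
    "w2": Fraction(1, 4),
    "w3": Fraction(1, 8),
    "w4": Fraction(1, 8),
}
OMEGA = frozenset(WEIGHTS)
EVENTS = [frozenset(s) for r in range(len(WEIGHTS) + 1) for s in combinations(WEIGHTS, r)]


def prob(event):
    """Probability of an event as an exact rational number."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


def cond(a, c):
    """Kolmogorov conditional probability P(A|C) = P(A n C) / P(C), for P(C) > 0."""
    if prob(c) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(a & c) / prob(c)


def axiom_1_holds():
    """For each admissible C, A -> P(A|C) is normalized and additive on disjoint events."""
    for c in EVENTS:
        if prob(c) == 0:
            continue
        if cond(OMEGA, c) != 1:
            return False
        for a in EVENTS:
            for b in EVENTS:
                if not (a & b) and cond(a | b, c) != cond(a, c) + cond(b, c):
                    return False
    return True


def axiom_2_holds():
    """P(A n B|C) = P(A|B n C) * P(B|C) whenever B n C has positive probability."""
    for a in EVENTS:
        for b in EVENTS:
            for c in EVENTS:
                if prob(b & c) == 0:
                    continue
                if cond(a & b, c) != cond(a, b & c) * cond(b, c):
                    return False
    return True


if __name__ == "__main__":
    print(axiom_1_holds(), axiom_2_holds())
```

Exact rational arithmetic is used so that the equalities are tested without rounding.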
Van Fraassen (1991) noted that if (6.121) holds, then the variable $x$ is a common
cause in the sense of Reichenbach for Alice’s and Bob’s outcomes (see Hofer-Szabó
(2015) for a recent paper in this direction). To explain this observation, suppose two
random processes $F$ and $G$ (like Alice’s and Bob’s measurements) are correlated,
i.e., $P(F = \lambda, G = \gamma) \neq P(F = \lambda)P(G = \gamma)$. What might cause the correlation?


1. Chance. If Alice and Bob independently throw dice but always get the same 
result, there is a computable nonzero probability for this to happen without any 
reason. But this probability decreases as the number of occurrences grows. 

2. Causation. One outcome influences or even determines the other. Maybe Bob, 
whose experiment is genuinely random, is able to manipulate Alice’s experiment 
once he has seen his outcome. But according to relativity theory or other basic 
notions of causality in space-time, this should be impossible if Alice and Bob 
perform their measurements simultaneously and far from each other. 

3. Ur-determinism. The initial conditions at the Big Bang plus deterministic Laws 
of Nature imply the correlation. However, physics becomes pointless if we en- 
dorse this option. The notion of explanation as the purpose of science is defeated 
and there is little difference between this argument and Divine Predestination. 

4. Identity. The motions of my mirror image are strongly correlated with me, but 
that is because this image is really the same as me (at least in so far as motion is 
concerned, as opposed to e.g. thoughts). This example might also be explained 
using causation. Another example consists of Alice and Bob filming the same 
random process (which may also be explained using the following concept). 

5. Common Cause. A random process X is said to be a common cause for two
correlated random processes if it precedes both and satisfies

\[ P(F = \lambda, G = \gamma | X = x) = P(F = \lambda | X = x)\, P(G = \gamma | X = x). \tag{6.248} \]

Another way to write this is $P(F = \lambda | G = \gamma, X = x) = P(F = \lambda | X = x)$, which
shows that a common cause X screens off the dependence of F on G. Often the 
common cause is hidden and has to be inferred from the observed correlation 
(having excluded other explanations, like the ones above). A nice example of 
this is the inference of a manuscript called Q in New Testament studies. It is 
clear that the Gospels of Matthew and Luke both draw on Mark, but they also 
contain strikingly similar or even identical non-Markan passages. For various 
reasons it is unlikely that either one copied these from the other, so that the main 
hypothesis is that they both rely on Q, which is now lost. See e.g. Mack (1993). 
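
The screening-off property of a common cause can be illustrated by a small exact computation. The following is only a sketch under stated assumptions: the toy model (a fair coin X and the two bias values) is hypothetical and chosen for illustration, and the joint distribution is built directly from the factorized form, so the code merely exhibits that F and G are correlated marginally yet independent once X is given.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: X is a fair coin; given X = x, F and G are
# independent and each equals 1 with probability bias[x].
bias = {0: Fraction(1, 10), 1: Fraction(9, 10)}
p_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def p_given_x(value, x):
    """P(F = value | X = x); the same law is used for G."""
    return bias[x] if value == 1 else 1 - bias[x]

# Joint distribution P(F = f, G = g, X = x), built from the factorized form.
joint = {
    (f, g, x): p_x[x] * p_given_x(f, x) * p_given_x(g, x)
    for f, g, x in product((0, 1), repeat=3)
}

def prob(event):
    """Probability of the set of triples (f, g, x) satisfying `event`."""
    return sum(p for key, p in joint.items() if event(*key))

# Marginally, F and G are correlated ...
p_f1 = prob(lambda f, g, x: f == 1)
p_g1 = prob(lambda f, g, x: g == 1)
p_f1_g1 = prob(lambda f, g, x: f == 1 and g == 1)
marginally_correlated = p_f1_g1 != p_f1 * p_g1

# ... but conditioning on X removes the correlation, as in (6.248).
def screened_off(x):
    p_xv = prob(lambda f, g, xx: xx == x)
    lhs = prob(lambda f, g, xx: f == 1 and g == 1 and xx == x) / p_xv
    rhs = (prob(lambda f, g, xx: f == 1 and xx == x) / p_xv) * (
        prob(lambda f, g, xx: g == 1 and xx == x) / p_xv
    )
    return lhs == rhs
```

Exact rational arithmetic is used so that the equalities can be tested without rounding tolerances.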


From this perspective, the amazing fact is that the correlations in the Alice and
Bob experiment with either spin-½ particles or photons cannot be explained by a
common cause, since its existence (in the form of x) would imply the Bell inequality.
However, of the four other explanations described above, no. 1 is ridiculous given
the statistics of the relevant experiments, no. 2 is at odds with relativity, and no.
4 seems inapplicable. This leaves no. 3, which seems only supported by ’t Hooft
(2016), who denies the independence assumptions (i.e. between the settings and the
state of the pair of particles undergoing measurement) lying at the basis of both the
Free Will Theorem and Bell’s theorems. Every way you look at it you lose!

Generalizations of Theorem 6.19 to operator algebras were given e.g. by Baez
(1987), Raggio (1988), Werner (1989), and Bacciagaluppi (1993), as follows. Let $A$
and $B$ be unital C*-algebras, with projective tensor product $A \hat{\otimes} B$ (i.e., the completion
of the algebraic tensor product $A \otimes B$ in the maximal C*-cross-norm), cf. §C.13;
the choice of the projective tensor product guarantees that each state on $A \otimes B$ extends
to a state on $A \hat{\otimes} B$ by continuity; conversely, since $A \otimes B$ is dense in $A \hat{\otimes} B$, each
state on the latter is uniquely determined by its values on the former. In particular,
product states $\rho \otimes \sigma$ and mixtures $\omega = \sum_i p_i \rho_i \otimes \sigma_i$ thereof are well defined on $A \hat{\otimes} B$.
If $A \subset B(H_1)$ and $B \subset B(H_2)$ are von Neumann algebras, and all states considered
are normal, it is easier to work with the spatial tensor product $A \overline{\otimes} B$, defined as the
double commutant (or weak completion) of $A \otimes B$ in $B(H_1 \otimes H_2)$. Any normal state
on $A \otimes B$ extends to a normal state on $A \overline{\otimes} B$ by continuity. Below we use $\hat{\otimes}$, but the
results also work for $\overline{\otimes}$. In what follows, $A$ and $B$ are unital C*-algebras.


Definition 6.23. Let $\omega$ be a state on $A \hat{\otimes} B$.


1. A product state is a state of the form $\omega = \rho \otimes \sigma$, i.e., $\omega$ is defined by linear (and
continuous) extension of $\omega(a \otimes b) = \rho(a)\sigma(b)$.

2. A state $\omega$ is uncorrelated when it is in the w*-closure of the convex hull of the
product states on $A \hat{\otimes} B$. In particular, states $\omega = \sum_i p_i \rho_i \otimes \sigma_i$, where $p_i > 0$ and
$\sum_i p_i = 1$, are uncorrelated (w*-convergent infinite sums are allowed here).

3. A state is correlated when it is not uncorrelated.


An uncorrelated state $\omega$ is pure precisely when it is a product of pure states. This
has the important consequence that both its restrictions $\omega_{|A}$ and $\omega_{|B}$ to $A$ and $B$,
respectively, are pure as well (the restriction $\omega_{|A}$ of a state $\omega$ on $A \hat{\otimes} B$ to, say, $A$ is
given by $\omega_{|A}(a) = \omega(a \otimes 1_B)$, where $1_B$ is the unit element of $B$, etc.). A correlated
pure state has the property that its restriction to $A$ or $B$ is mixed.


Proposition 6.24. The following conditions are equivalent:


• Each state on $A \hat{\otimes} B$ is uncorrelated;
• Each pure state on $A \hat{\otimes} B$ is a product state;
• At least one of the C*-algebras $A$ and $B$ is commutative.


For the proof see Takesaki (2002), Theorem 4.14.

Corollary 6.25. Correlated states exist iff $A$ and $B$ are both noncommutative.

As one might expect, this result is closely related to the Bell inequalities:

Proposition 6.26. For any $\omega \in S(A \hat{\otimes} B)$, the following conditions are equivalent:


• $\omega$ is uncorrelated.
• For all self-adjoint operators $a_1, a_2 \in A$ and $b_1, b_2 \in B$ of norm $\leq 1$ we have


\[ |\omega(a_1(b_1 + b_2) + a_2(b_1 - b_2))| \leq 2. \tag{6.249} \]


See Baez (1987), Raggio (1988), Bacciagaluppi (1993), and Landsman (2006a).

Corollary 6.27. If $A$ or $B$ is commutative, then (6.249) holds for all states $\omega$.
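
A hedged numerical illustration of the bound (6.249) in the simplest setting, which is not part of the original argument: for ±1-valued classical assignments the combination in (6.249) never exceeds 2 in absolute value, whereas the singlet state of two spin-½ particles reaches $2\sqrt{2}$. The singlet correlation $-\cos(\alpha - \beta)$ for measurement directions at angles $\alpha$ and $\beta$ is the standard textbook value and is taken as an assumption here; all function names are illustrative.

```python
import itertools
import math

def chsh_classical_max():
    """Maximum of |a1*(b1+b2) + a2*(b1-b2)| over all +/-1 assignments."""
    best = 0
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4):
        best = max(best, abs(a1 * (b1 + b2) + a2 * (b1 - b2)))
    return best

def singlet_correlation(alpha, beta):
    """Assumed textbook singlet correlation E(alpha, beta) = -cos(alpha - beta)."""
    return -math.cos(alpha - beta)

def chsh_singlet(a1, a2, b1, b2):
    """CHSH combination E(a1,b1) + E(a1,b2) + E(a2,b1) - E(a2,b2) for the singlet."""
    e = singlet_correlation
    return e(a1, b1) + e(a1, b2) + e(a2, b1) - e(a2, b2)

classical_bound = chsh_classical_max()
# Angles chosen so that the four correlations all contribute with the same sign.
quantum_value = abs(chsh_singlet(0.0, math.pi / 2, math.pi / 4, -math.pi / 4))
```

The classical maximum is found by brute force over the 16 deterministic assignments, which are the extreme points of the relevant convex set.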
An elegant geometric approach to the Bell inequalities was developed by Pitowsky
(1989, 1994), which we now summarize (also cf. Werner & Wolf, 2001).

Suppose we have a bipartite experiment with $m$ different settings $A = a_1, \ldots, a_m$
and $B = b_1, \ldots, b_m$ on each wing, and binary outcomes, i.e., in $\{0,1\}$. We now denote
the probability $P(F = 1|A = a_i)$ that $F(a_i)$ (i.e. the particular property measured
by experiment $F$ at setting $a_i$) is true by $p_i$ ($i = 1, \ldots, m$), and likewise we write $p_{j+m}$
for $P(G = 1|B = b_j)$, i.e., the probability that $G(b_j)$ is true, once again for $j = 1, \ldots, m$.
Furthermore, we abbreviate the probability that $F(a_i)$ and $G(b_j)$ are both true by


\[ p_{i,j+m} = P(F = 1, G = 1|A = a_i, B = b_j) \quad (i,j = 1, \ldots, m). \tag{6.250} \]


The $2m + m^2$ “surface probabilities” $p = (p_1, \ldots, p_{2m}; p_{1,m+1}, \ldots, p_{m,2m})$ form a vector
in $\mathbb{R}^{2m+m^2}$, which we wish to constrain by the following assumption: there
is a fact of the matter underlying each experiment according to which the pair
$(F(a_i), G(b_j))$ already had a truth value for each possible setting $(a_i, b_j)$, independently
of any measurement being carried out or not (“local realism”). Thus the
probabilities $p$ (which now arguably have an ignorance interpretation) must lie in
the convex polytope in $\mathbb{R}^{2m+m^2}$ defined as the convex hull $C_m$ of the following set
of (extreme) points: for each $2m$-tuple $\lambda = (\lambda_1, \ldots, \lambda_{2m})$, where $\lambda_i \in \{0,1\}$, define


\[ x_\lambda = (\lambda_1, \ldots, \lambda_{2m}; \lambda_1\lambda_{m+1}, \ldots, \lambda_m\lambda_{2m}) \in \mathbb{R}^{2m+m^2}, \tag{6.251} \]


i.e., the entry at place $k$ is $\lambda_k$ ($k = 1, \ldots, 2m$) and the entry at place $(i,j)$ is $\lambda_i \cdot \lambda_{m+j}$,
where $i,j = 1, \ldots, m$. The interpretation of this is that $x_\lambda$ represents the particular
fact of the matter where $F(a_i)$ has truth value $\lambda_i$ and $G(b_j)$ has truth value $\lambda_{m+j}$,
so that their conjunction $(F(a_i), G(b_j))$ has truth value $\lambda_i \cdot \lambda_{m+j}$. In this state the
probability of the said configuration is one and all other states have probability zero;
arbitrary probability assignments then lie in $C_m$. The point, then, is to characterize
the convex polytope $C_m \subset \mathbb{R}^{2m+m^2}$ through a finite set of inequalities, which turn
out to be generalized Bell inequalities. Seeing this result requires some background.
Let $V$ be a real topological vector space with (continuous) dual $V^*$; if $V = \mathbb{R}^n$ we
may also put $V^* = \mathbb{R}^n$ and write $\varphi(v)$ as an inner product $\langle\varphi, v\rangle$ in what follows.


1. Any (not necessarily convex) subset $S \subset V$ has a polar $S^\circ \subset V^*$ defined by

\[ S^\circ = \{\varphi \in V^* \mid \varphi(v) \leq 1 \ \forall v \in S\}, \tag{6.252} \]

which is a closed convex subset of $V^*$. If $S = K$ is a compact convex set, we have

\[ K^\circ = \{\varphi \in V^* \mid \varphi(v) \leq 1 \ \forall v \in \partial_e K\}. \tag{6.253} \]

2. The bipolar theorem (cf. e.g. Simon (2011, Theorem 5.5)) states that $S^{\circ\circ}$ is the
closed convex hull of $S \cup \{0\}$, i.e.,

\[ S^{\circ\circ} = \overline{\mathrm{co}}(S \cup \{0\}). \tag{6.254} \]

In particular, if $K$ is a closed convex set containing the origin, then

\[ K^{\circ\circ} = K, \tag{6.255} \]

and hence, if $K^\circ$ is a compact convex set, we may reconstruct $K$ from $K^\circ$ as

\[ K = \{v \in V \mid \varphi(v) \leq 1 \ \forall \varphi \in \partial_e K^\circ\}. \tag{6.256} \]

3. In particular, if $K$ is a convex polytope in a finite-dimensional vector space containing
the origin, then so is $K^\circ$. In that case, $\partial_e K^\circ$ is a finite set and so points in $K$
are characterized by a finite set of linear inequalities (6.256), which describe the
faces of the polytope. In this case, the associated (dual) description of $K$ is called
the Minkowski-Weyl Theorem, see e.g. Paffenholz (2010) for applications.


For example, among the five Platonic solids (i.e. in $\mathbb{R}^3$) the cube and the octahedron
are dual to each other, as are the dodecahedron and the icosahedron, whereas the
tetrahedron is self-dual. A propos, the latter arises as the convex polytope $C_1$ for
$m = 1$ in the above story: clearly $2m + m^2 = 3$, and for the vertices of $C_1$ one takes
the four points $x_\lambda$ ensuing from the four possibilities $\lambda = (0,0), (1,0), (0,1), (1,1)$,
i.e., $x_\lambda = (0,0,0), (1,0,0), (0,1,0), (1,1,1)$. Then the inequalities in (6.256) are


\[ p_{12} \geq 0, \quad p_1 \geq p_{12}, \quad p_2 \geq p_{12}, \quad p_1 + p_2 - p_{12} \leq 1. \tag{6.257} \]


For $m = 2$ the ensuing convex polytope $C_2 \subset \mathbb{R}^8$ is the convex hull of 16 extreme
points, whose inequalities may be found in Pitowsky (1989, p. 27); these imply the
CHSH inequality, whose violation in quantum mechanics therefore shows that the
probabilities in question have no local realistic model.
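
The construction of the extreme points (6.251) and the tetrahedron inequalities (6.257) can be checked mechanically. The sketch below is illustrative only (the function names are not from the source); it enumerates the vertices of $C_m$ and tests the four inequalities for $m = 1$ on the vertices, on their uniform mixture, and on one point that violates $p_1 \geq p_{12}$.

```python
from fractions import Fraction
from itertools import product

def vertices(m):
    """Extreme points x_lambda of C_m: (lambda_1..lambda_2m; lambda_i * lambda_{m+j})."""
    points = []
    for lam in product((0, 1), repeat=2 * m):
        pairs = [lam[i] * lam[m + j] for i in range(m) for j in range(m)]
        points.append(tuple(lam) + tuple(pairs))
    return points

def in_c1(p):
    """Check the four inequalities (6.257) for p = (p1, p2, p12)."""
    p1, p2, p12 = p
    return p12 >= 0 and p1 >= p12 and p2 >= p12 and p1 + p2 - p12 <= 1

c1 = vertices(1)
# Uniform mixture of the four vertices, computed exactly.
centre = tuple(sum(Fraction(v[k], 4) for v in c1) for k in range(3))
# A hypothetical assignment with p12 > p1, which no classical model allows.
outside = (Fraction(1, 2), Fraction(1, 2), Fraction(3, 5))
```

For $m = 2$ the same `vertices` function returns the 16 extreme points in $\mathbb{R}^8$ mentioned above; the corresponding inequalities are not reproduced here.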

More generally, suppose we have $n$ yes-no experiments $(E_1, \ldots, E_n)$ and some
subset $S_n$ of the set $\{(i,k) \mid 1 \leq i < k \leq n\}$ (above we had $n = 2m$, $E_i = F(a_i)$ for
$i = 1, \ldots, m$, $E_{m+j} = G(b_j)$ for $j = 1, \ldots, m$, and $S_n = \{(i, m+j) \mid 1 \leq i,j \leq m\}$). This
gives surface probabilities $(p_1, \ldots, p_n, p_{ik})$, where $(i,k) \in S_n$, which form a vector
$p$ in $\mathbb{R}^{n+|S_n|}$. As in (6.251), each truth assignment $\lambda = (\lambda_1, \ldots, \lambda_n)$, $\lambda_i \in \{0,1\}$, then
defines a point $x_\lambda \in \mathbb{R}^{n+|S_n|}$ with coordinates $(\lambda_1, \ldots, \lambda_n, \lambda_i \cdot \lambda_k)$, where once again
$(i,k) \in S_n$. This set of $2^n$ points in turn spans a convex polytope $C_{S_n}$, characterized
by inequalities following from the dual characterization (6.256). Classical thinking
would constrain the $p$ so as to lie in $C_{S_n}$, and indeed we have $p \in C_{S_n}$ iff there is a
probability space $(X, \Sigma, \mu)$ such that $p_i = \mu(A_i)$ and $p_{ik} = \mu(A_i \cap A_k)$ for certain
events $A_i \in \Sigma$, cf. Theorem 2.3 in Pitowsky (1989), which is based on Fine (1982).
Some authors claim on this basis that Bell-type inequalities have nothing to do
with physics, but surely the point is that some physical assumptions (notably local
realism) have to be made in order to justify the “classical thinking” behind $C_{S_n}$.


§6.6. The Colbeck–Renner Theorem

This section is based on Colbeck & Renner (2011, 2012a, 2012b), where the
main idea originates (alas with unclear assumptions and at best heuristic “proofs”),
Braunstein & Caves (1990), who provided steps 1 and 2 of the proof, and Landsman
(2015), whom we follow closely. See also Leegwater (2016) for a technically different
approach (by a far more complicated argument, Leegwater seems to manage
to do without our CP assumption, i.e., continuity of probabilities).


Chapter 7
Limits: Small $\hbar$


Limits are essential to the asymptotic Bohrification program. It was recognized at
an early stage in the development of quantum mechanics that the limit $\hbar \to 0$ of
Planck’s constant going to zero should play a role in the derivation of classical
physics from quantum theory, and later on also the thermodynamic limit (which
often means “$\lim_{N\to\infty}$”, where $N$ is the number of particles in the system) became a
subject of interest in quantum statistical mechanics. The conceptual status of these
limits will be discussed in Chapter 10; in the present one we mainly explain the
underlying mathematics. However, one question needs to be addressed immediately,
since it is a source of much confusion. Varying $N$ seems a realistic thing to do in the
lab or on paper, whereas $\hbar$ is a constant, so how can it be varied? The answer is that
$\hbar$ is a dimensionful constant, from which one forms dimensionless combinations
of $\hbar$ and other parameters; this combination then re-enters the theory as if it were a
dimensionless version of $\hbar$ that can indeed be varied. The oldest example is Planck’s
radiation formula $E_\nu/N_\nu = h\nu/(e^{h\nu/kT} - 1)$, with temperature $T$ as the pertinent
variable. Indeed, the observation of Einstein and Planck that in the limit $h\nu/kT \to 0$
this formula converges to the classical equipartition law $E_\nu/N_\nu = kT$ may well be
the first use of the $\hbar \to 0$ limit of quantum theory; note that Einstein put $h\nu/kT \to 0$
by letting $\nu \to 0$ at fixed $T$ and $h$, whereas Planck took $T \to \infty$ at fixed $\nu$ and $h$!
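
The convergence of Planck’s formula to equipartition can be seen numerically. The following sketch is illustrative: units are arbitrary (only the ratio $h\nu/kT$ matters), the function name is not from the source, and `math.expm1` is used to avoid cancellation for small arguments.

```python
import math

def planck_mean_energy(h_nu, k_t):
    """Mean energy h*nu / (exp(h*nu/kT) - 1); both arguments in the same energy units."""
    x = h_nu / k_t
    return h_nu / math.expm1(x)

k_t = 1.0  # thermal energy kT in arbitrary units (illustrative value)
# Ratio of the Planck energy to the equipartition value kT for decreasing h*nu/kT.
ratios = [planck_mean_energy(x * k_t, k_t) / k_t for x in (1.0, 1e-2, 1e-4)]
```

The ratios increase towards 1 as $h\nu/kT$ decreases, in line with the expansion $x/(e^x - 1) = 1 - x/2 + x^2/12 - \cdots$.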
Another example is the Hamiltonian $h = -\frac{\hbar^2}{2m}\Delta + V(x)$ in the Schrödinger equation
of non-relativistic quantum mechanics, where $m$ is the mass of the pertinent
particle. Here one may pass to dimensionless parameters by introducing an energy
scale $\varepsilon$ typical of $h$, like $\varepsilon = \sup_x |V(x)|$, as well as a typical length scale $\ell$, such
as $\ell = \varepsilon/\sup_x |\nabla V(x)|$ (if these quantities are finite). In terms of the dimensionless
variable $\tilde{x} = x/\ell$, the rescaled Hamiltonian $\tilde{h} = h/\varepsilon$ is then dimensionless and equal
to $\tilde{h} = -\tilde{\hbar}^2 \tilde{\Delta} + \tilde{V}(\tilde{x})$, where $\tilde{\hbar} = \hbar/(\ell\sqrt{2m\varepsilon})$, the operator $\tilde{\Delta}$ is the Laplacian for $\tilde{x}$, and
$\tilde{V}(\tilde{x}) = V(\ell\tilde{x})/\varepsilon$. Here $\tilde{\hbar}$ is dimensionless, and one might study the regime where it
is small. Similarly, it is often realistic to rescale the potential $V$ by a positive number
$\lambda$, in which case $h_\lambda = -\frac{\hbar^2}{2m}\Delta + \lambda V(x)$ can be rescaled to $h_\lambda/\lambda = -\frac{\tilde{\hbar}^2}{2m}\Delta + V(x)$,
with $\tilde{\hbar} = \hbar/\sqrt{\lambda}$, so that the “large $V$ limit” $\lambda \to \infty$ comes down to $\tilde{\hbar} \to 0$.
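
To get a feeling for the size of the dimensionless parameter $\hbar/(\ell\sqrt{2m\varepsilon})$, one may evaluate it for some scales. The sketch below is illustrative only: the two sets of scales (an electron at 1 eV over 1 nm, and a 1 g particle at 1 J over 1 cm) are hypothetical choices, and the function name is not from the source. It also checks that multiplying the energy scale by $\lambda$ at fixed length scale reduces the parameter by a factor $\sqrt{\lambda}$, in agreement with the second rescaling above.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s

def dimensionless_hbar(mass, energy_scale, length_scale):
    """Return hbar / (length_scale * sqrt(2 * mass * energy_scale)) for SI inputs."""
    return HBAR / (length_scale * math.sqrt(2.0 * mass * energy_scale))

# Illustrative (hypothetical) scales: an electron at 1 eV over 1 nm,
# and a 1 g particle at 1 J over 1 cm.
electron = dimensionless_hbar(9.1093837015e-31, 1.602176634e-19, 1e-9)
gram_ball = dimensionless_hbar(1e-3, 1.0, 1e-2)

# Rescaling V by lam multiplies the energy scale by lam and leaves the
# length scale unchanged, so the effective hbar shrinks like 1/sqrt(lam).
lam = 100.0
rescaled = dimensionless_hbar(9.1093837015e-31, lam * 1.602176634e-19, 1e-9)
```

On these assumed scales the parameter is of order $10^{-1}$ for the electron and of order $10^{-31}$ for the macroscopic particle.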
In (older) textbooks on quantum mechanics the limit $\hbar \to 0$ is typically studied
using the so-called WKB-approximation. This may be justified on historical grounds,
but in fact this approximation is rarely applicable, and is extremely delicate even
when it applies. Fortunately, a much more satisfactory and almost universally applicable
framework has become available since the 1990s, namely (strict) deformation
quantization, where the word “strict” (which we will henceforth omit) refers to the
fact that in this approach $\hbar$ is a real number that can “really” (!) be varied and hence
can be made small (as opposed to formal deformation quantization, where $\hbar$ is a formal
parameter having no actual value). Also, “strict” sometimes refers to the use of
C*-algebras and the high mathematical standards this brings. In the formalism that
follows, (deformation) quantization and the classical limit of quantum mechanics
are seen as two sides of the same coin, as the axioms of quantization are predicated
on recovering the correct classical limit, while conversely the classical limit only
makes sense in the context of some correct notion of quantization.

The starting point of deformation quantization is a phase space $X$, mathematically
described as a Poisson manifold, i.e., a manifold equipped with a Poisson
bracket $\{\cdot,\cdot\}$ on its algebra of smooth functions $C^\infty(X)$, see §3.2. We recall that
a Poisson bracket is a Lie bracket on $C^\infty(X)$ with the additional property that for
each $h \in C^\infty(X)$, the map $\delta_h(f) = \{h, f\}$ is a derivation of $C^\infty(X)$ with respect to its
structure as a commutative algebra under pointwise multiplication, i.e.,


\[ \delta_h(fg) = f\delta_h(g) + \delta_h(f)g. \tag{7.1} \]


Furthermore, like pointwise multiplication, the Poisson bracket preserves real-valuedness,
i.e., if $f \in C^\infty(X,\mathbb{R})$ and $g \in C^\infty(X,\mathbb{R})$, then also $\{f,g\} \in C^\infty(X,\mathbb{R})$.
As early as 1925, Dirac noted the formal analogy between Poisson brackets
of functions on phase space and commutators of operators on Hilbert space (i.e.,
$[a,b] = ab - ba$). Indeed, if $A$ is any C*-algebra, the commutator is a Lie bracket on
$A$, and if we use $[a,b]' = i(ab - ba)$, then also self-adjointness is preserved (in that
$a^* = a$ and $b^* = b$ implies that also $[a,b]'$ is self-adjoint, which fails to be the case
for the commutator itself unless it vanishes). Thus $[-,-]'$ is a Lie bracket on $A_{\mathrm{sa}}$.
Moreover, if for fixed $a \in A$ we define $\delta_a(b) = [a,b]'$, then we have the product rule


\[ \delta_a(bc) = \delta_a(b)c + b\delta_a(c), \tag{7.2} \]


which makes $\delta_a : A \to A$ a derivation. A problem arises if one wishes to restrict $\delta_a$
to $A_{\mathrm{sa}}$, since this subspace is not stable under multiplication. This may be remedied
by passing to the Jordan product (5.14), i.e., $a \circ b = \frac{1}{2}(ab + ba)$, which is defined on
$A_{\mathrm{sa}}$. If $a^* = a$, then $\delta_a : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ satisfies the rule (7.2) also with respect to $\circ$.
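
These algebraic claims are easy to test on matrices. The following sketch takes $A = M_2(\mathbb{C})$ with three arbitrarily chosen self-adjoint matrices (illustrative values, not from the source) and checks that $i(ab - ba)$ is self-adjoint, that $ab - ba$ itself is not, and that $\delta_a$ obeys the product rule with respect to the Jordan product. All helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def bracket(a, b):
    """Rescaled commutator i(ab - ba)."""
    return scale(1j, add(matmul(a, b), matmul(b, a), sign=-1))

def jordan(a, b):
    """Jordan product (ab + ba)/2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Three self-adjoint 2x2 matrices (arbitrary illustrative choices).
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]
c = [[4, 1], [1, 0.5]]

def delta_a(x):
    """The derivation delta_a(x) = i(ax - xa)."""
    return bracket(a, x)

plain_commutator = add(matmul(a, b), matmul(b, a), sign=-1)
leibniz_lhs = delta_a(jordan(b, c))
leibniz_rhs = add(jordan(delta_a(b), c), jordan(b, delta_a(c)))
```

The Jordan product rule follows from (7.2) applied to $bc$ and $cb$ separately, which is what the last two lines compare.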

All this remains true if $[-,-]'$ is rescaled by a nonzero real number. Which number
this should be was suggested by Schrödinger’s construction of momentum and
position operators on the Hilbert space $H = L^2(\mathbb{R})$ through the substitutions


\[ \hat{p} = \frac{\hbar}{i}\frac{d}{dx}; \tag{7.3} \]
\[ \hat{q} = x, \tag{7.4} \]


where “$x$” is the multiplication operator $m_{\mathrm{id}}$ (with $\mathrm{id}(x) = x$), i.e., $\hat{q}\psi(x) = x\psi(x)$;
for the moment we will not be bothered by the fact that these operators are unbounded;
let us say they are both defined on the domain $C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$.

This yields the canonical commutation relations (which formally hold on $C_c^\infty(\mathbb{R})$):


\[ \frac{i}{\hbar}[\hat{p},\hat{q}] = 1_H. \tag{7.5} \]


Noting the Poisson brackets (in which $p, q$ are the coordinate functions on $X = \mathbb{R}^2$)


\[ \{p,q\} = 1_X, \tag{7.6} \]


it is clear that the analogy should be between $\{-,-\}$ and $(i/\hbar)[-,-]$. Thus Dirac wrote:

‘The strong analogy between the quantum P.B. defined by [$(i/\hbar)$ times the commutator] and
the classical P.B. (...) leads us to make the assumption that the quantum P.B.’s, or at any
rate the simpler ones of them, have the same values as the corresponding classical P.B.’s.’
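
The normalization $(i/\hbar)$ can be checked by a purely algebraic computation. The sketch below is a formal illustration only: it lets $\hat{p} = (\hbar/i)\,d/dx$ and $\hat{q} = x$ act on polynomials, represented by coefficient lists, as a stand-in for the smooth functions on which the operators are actually defined (polynomials are not square-integrable). The value of `HBAR` and all names are illustrative assumptions.

```python
HBAR = 0.5  # illustrative numerical value; any nonzero real number works

def q_op(poly):
    """Multiplication by x on a polynomial given by its coefficient list [c0, c1, ...]."""
    return [0] + list(poly)

def p_op(poly):
    """(hbar/i) d/dx on a polynomial coefficient list."""
    derivative = [k * c for k, c in enumerate(poly)][1:] or [0]
    return [(HBAR / 1j) * c for c in derivative]

def sub(a, b):
    """Difference of two coefficient lists, padded to equal length."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def dirac_bracket(poly):
    """(i/hbar)(p q - q p) applied to the polynomial."""
    commutator = sub(p_op(q_op(poly)), q_op(p_op(poly)))
    return [(1j / HBAR) * c for c in commutator]

psi = [1.0, -2.0, 0.0, 3.0]  # psi(x) = 1 - 2x + 3x^3
result = dirac_bracket(psi)
```

The output reproduces the input coefficients, i.e. $(i/\hbar)[\hat{p},\hat{q}]$ acts as the identity, mirroring $\{p,q\} = 1_X$.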


Combined with Heisenberg’s decisive idea that quantum mechanics should be an
Umdeutung (i.e., reinterpretation) of classical mechanics, one is led to the idea that
“quantization” should be given by a linear map


\[ f \mapsto Q_\hbar(f), \tag{7.7} \]


where $f$ is some (smooth) function on phase space $X$ and $Q_\hbar(f)$ is some operator
on some “corresponding” Hilbert space, whose identification or construction is a
separate problem (but for $X = \mathbb{R}^2$ it should apparently be $L^2(\mathbb{R})$), such that


\[ \frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] = Q_\hbar(\{f,g\}) \tag{7.8} \]


at least for functions $f, g \in C^\infty(X)$ with ‘the simpler’ Poisson brackets. If only to do
justice to Schrödinger’s example (7.3) - (7.4) with (7.5), one should also require


\[ Q_\hbar(1_X) = 1_H. \tag{7.9} \]


The act of quantization should also preserve the adjoint, i.e., writing $f^*(x) = \overline{f(x)}$,


\[ Q_\hbar(f^*) = Q_\hbar(f)^*. \tag{7.10} \]


Putting $\hbar$ on the right-hand side of eqs. (7.5) and (7.8), Dirac (and similarly the
Dreimännerarbeit Born–Heisenberg–Jordan) concluded from these equations that:


‘classical mechanics may be regarded as the limiting case of quantum mechanics when $\hbar$
tends to zero.’


In the remainder of this chapter we try to do justice to this fabulous insight of Dirac’s
(and also of Born, Heisenberg, and Jordan, or even Planck, Einstein, and Bohr, none
of whom seem to have quite appreciated the stupendous complexity of the claim).




7.1 Deformation quantization 


Recall Definition C.121 of a continuous bundle of C*-algebras over some space $I$,
which below is taken to be a subset of the unit interval $[0,1]$ that contains 0 as an
accumulation point (so one may have e.g. $I = [0,1]$ itself, or $I = (1/\mathbb{N}) \cup \{0\}$).


Definition 7.1. A deformation quantization of a Poisson manifold $X$ consists of a
continuous bundle of C*-algebras $(A, \{\varphi_\hbar : A \to A_\hbar\}_{\hbar \in I})$ over $I$, along with maps

\[ Q_\hbar : \tilde{A}_0 \to A_\hbar \quad (\hbar \in I), \tag{7.11} \]

where $\tilde{A}_0$ is a dense subspace of $A_0 = C_0(X)$, such that:


1. $Q_0$ is the inclusion map $\tilde{A}_0 \hookrightarrow A_0$;
2. Each map $Q_\hbar$ is linear and satisfies (7.10);
3. For each $f \in \tilde{A}_0$ the following map is a continuous section of the bundle:

\[ 0 \mapsto f; \tag{7.12} \]
\[ \hbar \mapsto Q_\hbar(f) \quad (\hbar > 0); \tag{7.13} \]

4. For all $f, g \in \tilde{A}_0$ one has the Dirac–Groenewold–Rieffel condition

\[ \lim_{\hbar \to 0} \left\| \frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] - Q_\hbar(\{f,g\}) \right\| = 0. \tag{7.14} \]


It follows from the definition of a continuous bundle that continuity properties like


\[ \lim_{\hbar \to 0} \|Q_\hbar(f)\| = \|f\|_\infty; \tag{7.15} \]
\[ \lim_{\hbar \to 0} \|Q_\hbar(f)Q_\hbar(g) - Q_\hbar(fg)\| = 0, \tag{7.16} \]


are automatically satisfied. Let us note that condition (7.9) is absent from this definition,
because $1_X \notin C_0(X)$ whenever $X$ is not compact, in which case typically also
the C*-algebras $A_\hbar$ have no unit (see below). However, the given conditions turn out
to be sufficiently powerful to produce the “right” examples. We give one of the main
such examples without proof (the underlying analysis is quite forbidding). We put


\[ A_0 = C_0(T^*\mathbb{R}^n); \tag{7.17} \]
\[ A_\hbar = B_0(L^2(\mathbb{R}^n)) \quad (\hbar > 0), \tag{7.18} \]

where $T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$ carries the canonical Poisson structure (3.34), and $A_\hbar$ is the C*-algebra
of compact operators on the familiar Hilbert space $L^2(\mathbb{R}^n)$ of wave-functions
on $\mathbb{R}^n$. For the sake of completeness we also mention that


\[ A = C^*((\mathbb{R}^n \times \mathbb{R}^n)^T) \tag{7.19} \]


is the (reduced) C*-algebra of the tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$ to the pair groupoid
$\mathbb{R}^n \times \mathbb{R}^n$ on $\mathbb{R}^n$, see §§C.16, C.19, where one may also find the maps $\varphi_\hbar$.
Let us summarize the situation. Continuity of the limit $\hbar \to 0$ is hard to envisage
if one merely has the classical phase space $X = T^*\mathbb{R}^n$ and the quantum Hilbert
space $L^2(\mathbb{R}^n)$ in mind. However, the move to either: the underlying Lie groupoids
$T\mathbb{R}^n$ and $\mathbb{R}^n \times \mathbb{R}^n$, which jointly comprise the smooth tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$,
or: the corresponding canonically defined C*-algebras $C_0(T^*\mathbb{R}^n)$ and $B_0(L^2(\mathbb{R}^n))$,
which are glued together as a continuous bundle (7.17) - (7.19), does give rise to a
satisfactory structure that makes the limit $\hbar \to 0$ “continuous”.

In this example, various possibilities for the quantization maps $Q_\hbar$ arise. As explained
in §C.19, the groupoid structure underlying (7.17) - (7.18) suggests Weyl’s
prescription (C.549), which for convenience we reproduce:

\[ Q_\hbar^W(f)\psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^ny}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar} f\left(\tfrac{1}{2}(x+y), p\right)\psi(y), \tag{7.20} \]

where $f$ lies in the image of $C_c^\infty(T\mathbb{R}^n)$ under the fiberwise Fourier transform (C.547).
This image, then, is the space $\tilde{A}_0$ in Definition 7.1. We may rewrite (7.20) as

\[ Q_\hbar^W(f) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, f(q,p)\,\Omega_\hbar^W(q,p), \tag{7.21} \]

where the operators in the integrand are given by

\[ \Omega_\hbar^W(q,p)\psi(x) = 2^n e^{2ip(x-q)/\hbar}\psi(2q - x). \tag{7.22} \]
The purpose of (7.21) is that for each $\psi \in L^2(\mathbb{R}^n)$ we then obviously have

\[ \langle\psi, Q_\hbar^W(f)\psi\rangle = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi)^n}\, f(q,p)\, W_\psi^\hbar(p,q), \tag{7.23} \]

where $W_\psi^\hbar : T^*\mathbb{R}^n \to \mathbb{R}$ is the Wigner function, given by

\[ W_\psi^\hbar(p,q) = \hbar^{-n}\langle\psi, \Omega_\hbar^W(q,p)\psi\rangle \tag{7.24} \]
\[ = \int_{\mathbb{R}^n} d^nv\, e^{ipv}\, \overline{\psi(q + \tfrac{1}{2}\hbar v)}\,\psi(q - \tfrac{1}{2}\hbar v). \tag{7.25} \]

If $\|\psi\| = 1$, then $W_\psi^\hbar$ gives a “phase space portrait” of the corresponding pure state
$e_\psi$ on $B_0(L^2(\mathbb{R}^n))$. However, this portrait cannot be interpreted as a probability density
on $T^*\mathbb{R}^n$, since the Wigner function is not necessarily positive. This reflects a
problem with Weyl’s quantization map $Q_\hbar^W$ itself (at fixed $\hbar > 0$). We say that $Q_\hbar$ as
introduced in (7.11) is positive if, for each $f \in \tilde{A}_0 \subset A_0$ (seen as a C*-algebra),

\[ f \geq 0 \ \Rightarrow\ Q_\hbar(f) \geq 0, \tag{7.26} \]

where positivity of $Q_\hbar(f)$ is defined in the C*-algebra $A_\hbar$ (which in the case at hand
is $B_0(L^2(\mathbb{R}^n))$). This is not the case for $Q_\hbar^W$. Moreover, $Q_\hbar^W$ fails to be continuous,
and for this reason it cannot be extended to $A_0$ (at least not in the obvious way, viz.
by continuity). Fortunately, both problems can be resolved by a change in $Q_\hbar$.
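
That the Wigner function may take negative values can be seen in a one-dimensional example. The following sketch is only illustrative: it evaluates the integral over $v$ in (7.25) numerically for $n = 1$ at the origin of phase space, for a Gaussian and for an odd unit vector (the first excited state of a harmonic oscillator in units with $m\omega = 1$, an assumed example not taken from the source). Function names, the value of `HBAR`, and the integration parameters are assumptions.

```python
import cmath
import math

HBAR = 1.0  # illustrative value

def psi0(x):
    """Gaussian ground state, unit norm in L^2(R)."""
    return (math.pi * HBAR) ** -0.25 * math.exp(-x * x / (2 * HBAR))

def psi1(x):
    """Odd first excited oscillator state (units with m*omega = 1), unit norm."""
    return math.sqrt(2 / HBAR) * x * psi0(x)

def wigner_integral(psi, p, q, half_width=20.0, steps=4000):
    """Trapezoidal approximation of the integral over v in (7.25) for n = 1."""
    dv = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        v = -half_width + k * dv
        weight = 0.5 if k in (0, steps) else 1.0
        left = complex(psi(q + 0.5 * HBAR * v)).conjugate()
        right = psi(q - 0.5 * HBAR * v)
        total += weight * cmath.exp(1j * p * v) * left * right * dv
    return total.real

w0 = wigner_integral(psi0, 0.0, 0.0)
w1 = wigner_integral(psi1, 0.0, 0.0)
```

For any odd $\psi$ the integrand at $p = q = 0$ equals $-|\psi(\tfrac{1}{2}\hbar v)|^2$, so the value is $-2/\hbar$ for a unit vector; the sign does not depend on how the Wigner function is normalized.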
A strict deformation quantization of $\mathbb{R}^{2n}$ that is positive exists under the name
of Berezin quantization, denoted by $Q_\hbar^B$. However, the fundamental idea of the underlying
coherent states goes back to Schrödinger. For each $(p,q) \in \mathbb{R}^{2n}$ and $\hbar > 0$,
define a unit vector $\phi_\hbar^{(p,q)} \in L^2(\mathbb{R}^n)$, called a coherent state, by

\[ \phi_\hbar^{(p,q)}(x) = (\pi\hbar)^{-n/4} e^{-ipq/2\hbar} e^{ipx/\hbar} e^{-(x-q)^2/2\hbar}. \tag{7.27} \]

Writing $z = p + iq$, the transition probability between two coherent states is

\[ |\langle\phi_\hbar^{(z)}, \phi_\hbar^{(z')}\rangle|^2 = e^{-|z-z'|^2/2\hbar}. \tag{7.28} \]
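
The transition probability between coherent states can be confirmed numerically for $n = 1$. The sketch below is illustrative only: the two phase-space points, the value of $\hbar$, and the integration window are arbitrary assumptions, and the helper names are not from the source. It compares a trapezoidal approximation of the squared overlap with the Gaussian $e^{-|z-z'|^2/2\hbar}$.

```python
import cmath
import math

def coherent_state(p, q, hbar):
    """Return the function x -> phi_hbar^{(p,q)}(x) of (7.27) for n = 1."""
    norm = (math.pi * hbar) ** -0.25
    def phi(x):
        phase = cmath.exp(-1j * p * q / (2 * hbar) + 1j * p * x / hbar)
        return norm * phase * math.exp(-((x - q) ** 2) / (2 * hbar))
    return phi

def inner_product(f, g, lo=-30.0, hi=30.0, steps=6000):
    """Trapezoidal approximation of the L^2 inner product, antilinear in f."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x).conjugate() * g(x) * dx
    return total

hbar = 0.5
z1, z2 = (1.0, -0.5), (0.25, 0.75)  # pairs (p, q), arbitrary illustrative values
phi1, phi2 = coherent_state(*z1, hbar), coherent_state(*z2, hbar)
norm_sq = inner_product(phi1, phi1).real
numeric = abs(inner_product(phi1, phi2)) ** 2
dist_sq = (z1[0] - z2[0]) ** 2 + (z1[1] - z2[1]) ** 2
predicted = math.exp(-dist_sq / (2 * hbar))
```

The same routine also confirms that the vectors (7.27) have unit norm, at least for the sample point used.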


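In terms of these coherent states, we define $Q_\hbar^B : C_0(T^*\mathbb{R}^n) \to B_0(L^2(\mathbb{R}^n))$ by

\[ Q_\hbar^B(f) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, f(p,q)\, |\phi_\hbar^{(p,q)}\rangle\langle\phi_\hbar^{(p,q)}|, \tag{7.29} \]

where the integral is meant in the sense that for each $\psi, \varphi \in L^2(\mathbb{R}^n)$ we have

\[ \langle\varphi, Q_\hbar^B(f)\psi\rangle = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, f(p,q)\, \langle\varphi, \phi_\hbar^{(p,q)}\rangle\langle\phi_\hbar^{(p,q)}, \psi\rangle. \tag{7.30} \]

In particular, for each unit vector $\psi \in L^2(\mathbb{R}^n)$ we may write

\[ \langle\psi, Q_\hbar^B(f)\psi\rangle = \int_{T^*\mathbb{R}^n} d\mu_\psi\, f, \tag{7.31} \]

where $\mu_\psi$ is the probability measure on $T^*\mathbb{R}^n$ with density

\[ B_\psi^\hbar(p,q) = |\langle\phi_\hbar^{(p,q)}, \psi\rangle|^2, \tag{7.32} \]

called the Husimi function of $\psi \in L^2(\mathbb{R}^n)$; in other words, $\mu_\psi$ is given by

\[ d\mu_\psi(p,q) = \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, B_\psi^\hbar(p,q). \tag{7.33} \]

Weyl and Berezin quantization are related in many ways, for example, by

\[ Q_\hbar^B(f) = Q_\hbar^W\left(e^{\frac{\hbar}{4}\Delta_{2n}} f\right), \tag{7.34} \]

where $\Delta_{2n} = \sum_{j=1}^n (\partial^2/\partial p_j^2 + \partial^2/\partial (q^j)^2)$, from which it follows that Weyl and
Berezin quantization are asymptotically equal in the sense that for any $f \in \tilde{A}_0$,

\[ \lim_{\hbar \to 0} \|Q_\hbar^B(f) - Q_\hbar^W(f)\| = 0. \tag{7.35} \]

Indeed, this provides one way (among various others) of proving that $Q_\hbar^B$ satisfies
Definition 7.1, where we note that even though $Q_\hbar^B$ is defined on all of $C_0(T^*\mathbb{R}^n)$,
eq. (7.14) only holds on a suitable dense subspace thereof, such as $C_c^\infty(T^*\mathbb{R}^n)$.


7.2 Quantization and internal symmetry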


In the presence of symmetries, Dirac’s condition (7.8) can often be met by suitable 
functions f and g related to the symmetries in question, though such functions may 
be unbounded. This blasts the C*-algebraic framework, but it does so in a controlled 
way. We start with internal symmetries, like spin, which will be coupled to motion 
in the next step. Let G be a Lie group with Lie algebra g, to which we associate: 


• The “classical” Lie–Poisson manifold $\mathfrak{g}^*$, see (3.98), whose Poisson bracket we
now preface with a minus sign, so that instead of (3.98) and (3.99) we now have

\[ \{f,g\}_-(\theta) = -C_{ab}^c\,\theta_c\,\frac{\partial f(\theta)}{\partial\theta_a}\frac{\partial g(\theta)}{\partial\theta_b}; \tag{7.36} \]
\[ \{\hat{A},\hat{B}\}_- = -\widehat{[A,B]}. \tag{7.37} \]

We write $\mathfrak{g}_-^*$ for this Poisson manifold.
• The “quantum-mechanical” reduced group(oid) C*-algebra $C_r^*(G)$, cf. §C.18,
defined as the norm-closure of $\pi(C_c^\infty(G))$ within $B(L^2(G))$, where

\[ \pi(f)\psi = f * \psi; \tag{7.38} \]
\[ f * \psi(x) = \int_G dy\, f(y)\psi(y^{-1}x), \tag{7.39} \]

where $f \in C_c^\infty(G)$ and $\psi \in L^2(G)$, cf. (C.481), and $dy$ is Haar measure on $G$
(which also provides the measure defining the Hilbert space $L^2(G)$).


We then obtain a continuous bundle of C*-algebras, with fibers and total C*-algebra

\[ A_0 = C_r^*(\mathfrak{g}); \tag{7.40} \]
\[ A_\hbar = C_r^*(G) \quad (\hbar > 0); \tag{7.41} \]
\[ A = C_r^*(G^T), \tag{7.42} \]

where $\mathfrak{g}$ is seen as an abelian Lie group under addition, cf. Theorem C.123. We have

\[ C_r^*(\mathfrak{g}) \cong C_0(\mathfrak{g}^*), \tag{7.43} \]

which isomorphism (i.e. of C*-algebras) is given by the Fourier transform

\[ \hat{f}(\theta) = \int_{\mathfrak{g}} d^nA\, e^{-i\theta(A)} f(A); \tag{7.44} \]
\[ f(A) = \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi)^n}\, e^{i\theta(A)} \hat{f}(\theta), \tag{7.45} \]

where initially $f \in C_c^\infty(\mathfrak{g})$, and the map $f \mapsto \hat{f}$ is subsequently extended to $C^*(\mathfrak{g})$
by continuity. Here the normalization of Lebesgue measure $d^nA$ on $\mathfrak{g}$ is arbitrary, but
the normalization of $d^n\theta$ is thereby fixed. In what follows, we take a (left-invariant)
Haar measure $dx$ on $G$ and fix the normalization of $d^nA$ by the condition

\[ J(0) = 1 \tag{7.46} \]

in the definition of the Jacobian under the exponential map $\exp : \mathfrak{g} \to G$, i.e.,

\[ J(A) = \frac{d(\exp(A))}{d^nA}. \tag{7.47} \]

With $A_0 = C_r^*(\mathfrak{g})$, the quantization map $Q_\hbar : C_c^\infty(\mathfrak{g}) \to C^*(G)$ is then given by

\[ Q_\hbar(f)(e^A) = \hbar^{-n} f(A/\hbar), \tag{7.48} \]

where $n = \dim(G)$ and we assume that $\hbar > 0$ is small enough that $\hbar$ times the support
of $f \in C_c^\infty(\mathfrak{g})$ is contained in an open neighbourhood $U$ of $0 \in \mathfrak{g}$ where the
exponential map is a diffeomorphism onto some open neighbourhood $U'$ of $e \in G$;
otherwise a cutoff function should be included. Equivalently, defining $\tilde{A}_0 \subset C_0(\mathfrak{g}^*)$
as the image of $C_c^\infty(\mathfrak{g})$ under the Fourier transform $f \mapsto \hat{f}$ (which consists of the
so-called Paley–Wiener functions on $\mathfrak{g}^*$), the map $Q_\hbar : \tilde{A}_0 \to C^*(G)$ is given by

\[ Q_\hbar(\hat{f})(e^A) = \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\theta(A)/\hbar} \hat{f}(\theta). \tag{7.49} \]
Although these maps satisfy (7.14), if $G$ is non-abelian there are no natural functions
on $\mathfrak{g}^*$ whose quantizations satisfy the exact Dirac condition (7.8). This is a limitation
of the C*-algebraic framework, since candidate functions like

\[ \hat{A} : \mathfrak{g}^* \to \mathbb{R}; \tag{7.50} \]
\[ \hat{A}(\theta) = \theta(A), \tag{7.51} \]

whose Poisson brackets (3.99) are promising, are unbounded. However, this is easily
remedied by regarding $C_r^*(G)$ as an algebra of bounded operators on the Hilbert
space $L^2(G)$ (which indeed is the way it was originally defined) rather than abstractly.
This “spatial” context allows the passage to the Lie algebra, as reviewed in
§5.6, see especially (5.156) - (5.161). First note that (7.38) - (7.39) is a special case
of (5.172), where $H = L^2(G)$ and $u = u_L$, i.e., the left-regular representation

\[ u_L(y)\psi(x) = \psi(y^{-1}x). \tag{7.52} \]


In this representation, the construction (5.156) then realizes $\mathfrak{g}$ as right-invariant differential
operators on the Gårding domain $D_G \subset C^\infty(G)$. By definition of $C^*(G)$,
seen as an operator on $L^2(G)$ the function $Q_\hbar(\hat{f})$ is given in coordinates by

\[ Q_\hbar(\hat{f}) = \int_{\mathfrak{g}} d^nX\, J(X) \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\theta(X)/\hbar} \hat{f}(\theta)\, u_L\left(\exp\left(\sum_i X_iT_i\right)\right). \tag{7.53} \]

Here $(X_1, \ldots, X_n)$ in (7.53) are coordinates on $\mathfrak{g}$ defined by a basis choice $(T_1, \ldots, T_n)$,
i.e., $A = \sum_i X_iT_i$. The function $\hat{T}_j$ on $\mathfrak{g}^*$ is then simply given by the coordinate function
$\hat{T}_j(\theta) = \theta_j$. Now take $A \in \mathfrak{g}$ and assume that $\hat{f} = \hat{A}$. This function is unbounded,
but the following formal calculation is rigorously correct on the Gårding domain and
may be justified by some distribution theory. For simplicity we assume that $G$ is unimodular,
in which case $J(X) = 1 + O(X^2)$ as $X \to 0$, so that all first derivatives of $J$
vanish at $X = 0$. Taking $\hat{f} = \hat{T}_j$ in (7.53) then gives

\[ Q_\hbar(\hat{T}_j) = \int_{\mathfrak{g}} d^nX\, J(X) \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\theta(X)/\hbar}\, \theta_j\, u_L\left(\exp\left(\sum_i X_iT_i\right)\right) \]
\[ = -i \int_{\mathfrak{g}} d^nX\, J(\hbar X)\, u_L\left(\exp\left(\hbar\sum_i X_iT_i\right)\right) \frac{\partial\delta(X)}{\partial X_j} \]
\[ = i\hbar\, u_L'(T_j), \tag{7.54} \]

from which we obtain

\[ Q_\hbar(\hat{A}) = i\hbar\, u_L'(A) = \pi_L(A). \tag{7.55} \]

This explains the need for minus the Lie–Poisson bracket, since instead of (3.99) we
now have (7.37), so that (5.160) gives the exact result (7.8) for $f = \hat{A}$ and $g = \hat{B}$:

\[ \frac{i}{\hbar}[Q_\hbar(\hat{A}), Q_\hbar(\hat{B})] = Q_\hbar(\{\hat{A},\hat{B}\}_-). \tag{7.56} \]

The minus sign in the Lie–Poisson bracket could have been avoided by writing
$f(-A/\hbar)$ in (7.48), whose minus sign would have propagated into (5.159) and hence
in the commutation relations (5.160), but the latter are so engrained in the physics
literature that we see the minus sign on the bracket in (7.56) as the lesser evil.
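
The sign bookkeeping in (7.56) can be tested in a finite-dimensional stand-in. The sketch below is an illustration under explicit assumptions: the Lie algebra is $\mathfrak{so}(3)$ with basis matrices satisfying $[T_1,T_2] = T_3$ (cyclic), the derived representation $u'$ is replaced by the defining $3 \times 3$ matrix representation rather than the left-regular one, and the quantization of $\hat{A}$ is taken to be $i\hbar\,u'(A)$. The value of `HBAR` and all names are hypothetical.

```python
HBAR = 0.7  # illustrative value

# Basis of so(3) as real antisymmetric matrices, with [T1, T2] = T3 (cyclic).
T = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def quantize(a):
    """Assumed stand-in for Q_hbar(A-hat): i * hbar * u'(A), with u' the defining representation."""
    return scale(1j * HBAR, a)

def dirac_lhs(a, b):
    """(i/hbar) [Q(A-hat), Q(B-hat)]."""
    return scale(1j / HBAR, commutator(quantize(a), quantize(b)))

def dirac_rhs(a, b):
    """Q({A-hat, B-hat}_-), using {A-hat, B-hat}_- = -[A, B]-hat as in (7.37)."""
    return scale(-1, quantize(commutator(a, b)))

def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for r, s in zip(a, b) for x, y in zip(r, s))

worst = max(max_abs_diff(dirac_lhs(a, b), dirac_rhs(a, b)) for a in T for b in T)
```

With the plus sign in the Lie–Poisson bracket the two sides would differ by an overall sign, which is the point of the remark above.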

Any continuous unitary representation $u_\lambda$ of $G$ (where $\lambda$ is some label) induces a
representation $u_\lambda$ of $C_c^\infty(G)$ by (5.173), which may be extended to a representation
of $C^*(G)$ by continuity (the same is true for $C_r^*(G)$ provided $u_\lambda$ is weakly contained
in $L^2(G)$, cf. §C.18). This gives operators $u_\lambda(Q_\hbar(\hat{f}))$ which, by the same formal
computation as for the case $u = u_L$ above, for $A \in \mathfrak{g}$ rigorously give rise to operators

\[ \pi_\lambda(A) = i\hbar\, u_\lambda'(A), \tag{7.57} \]

satisfying the like of (5.160) for fixed values of $\hbar$ (but without control over the limit
$\hbar \to 0$). Many commutation relations in quantum mechanics take this form, where
both irreducible and reducible representations $u$ give rise to interesting examples.
The reducible case typically comes from group actions and is best studied using the
formalism of action groupoids reviewed in the next section, where we will see that
further operators start playing a role. The irreducible case, on the other hand, gives
rise to intriguing new examples of continuous bundles of C*-algebras, where $\hbar$ (now
related to the label $\lambda$) takes values in a discrete set and may be sent to zero, cf. §8.1.


7.3 Quantization and external symmetry


We now generalize the setting of the preceding section from groups taken by themselves
to group actions. Let a Lie group $G$ act smoothly on some manifold $Q$; for
example, we may have $Q = \mathbb{R}^3$ with either $G = SO(3)$ acting by rotations, or $G = \mathbb{R}^3$
acting by translations. We now take $X = \mathfrak{g}^* \times Q$. Recalling the notation (3.71) and
writing $\xi_a = \xi_{T_a}$, we define the action Poisson bracket

\[ \{f,g\} = -C_{ab}^c\,\theta_c\,\frac{\partial f}{\partial\theta_a}\frac{\partial g}{\partial\theta_b} + \frac{\partial f}{\partial\theta_a}\,\xi_a g - \xi_a f\,\frac{\partial g}{\partial\theta_a}. \tag{7.58} \]

Interesting special cases arise if we take $A \in \mathfrak{g}$ and define $\hat{A} \in C^\infty(\mathfrak{g}^*)$ as before, i.e.,
$\hat{A}(\theta) = \theta(A)$, now regarded as a function on $\mathfrak{g}^* \times Q$ (ignoring the second argument
$q$). Similarly, if $f \in C^\infty(Q)$ we write $f$ for the corresponding function on $\mathfrak{g}^* \times Q$
(ignoring the first argument $\theta$). This gives the coordinate-independent expressions

\[ \{\hat{A},\hat{B}\} = -\widehat{[A,B]}; \tag{7.59} \]
\[ \{\hat{A}, f\} = \xi_A f; \tag{7.60} \]
\[ \{f,g\} = 0. \tag{7.61} \]


Clearly, if Q is a point (with trivial G-action) we recover (minus) the Lie-Poisson
structure on $\mathfrak{g}^*$. If, on the other hand, $Q=\mathbb{R}^3$ and $G=\mathbb{R}^3$ acts on Q by translation,
i.e., $a\cdot x = x+a$, we recover the canonical Poisson bracket (3.34), where the momenta
$p_a$ ($a=1,\ldots,n$, here with n = 3) are identified with the coordinates $\theta_a$ on the dual of the Lie
algebra of $\mathbb{R}^3$, which is just $\mathbb{R}^3$ itself (with the usual basis $(e_1,e_2,e_3)$). Therefore,
the Poisson bracket (3.34) on $\mathbb{R}^{2n}$ may be generalized in two ways:


1. By passing to arbitrary cotangent bundles T*M, whose canonical Poisson bracket 
is still given in local coordinates by (3.34), which emphasizes the role of mo- 
menta as fiber coordinates on T*M. 

2. By passing to the setting discussed here, which emphasizes the role of momenta
as generators of global translations of the base space $\mathbb{R}^3$ (a property that breaks
the p-q symmetry and cannot be generalized to arbitrary cotangent bundles).


A richer structure emerges if we keep $Q=\mathbb{R}^3$ but now take G = E(3), i.e.,

$$E(3) = SO(3)\ltimes\mathbb{R}^3, \tag{7.62}$$

known as the Euclidean group. To explain its group structure, let some group L act
on a vector space V, seen as an abelian group under addition. Then the operations

$$(\lambda,v)\cdot(\lambda',v') = (\lambda\lambda',\, v+\lambda\cdot v'); \tag{7.63}$$
$$(\lambda,v)^{-1} = (\lambda^{-1},\, -\lambda^{-1}\cdot v), \tag{7.64}$$

turn $G=L\ltimes V$ into a group, called the semi-direct product of L and V.
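The group axioms behind (7.63) - (7.64) can be checked numerically in a small case. The following is a minimal sketch, assuming the planar analogue L = SO(2) (parametrized by an angle) acting on V = R^2 by rotations; the helper names `rot`, `mul`, `inv` and `close` are illustrative choices and not notation from the text.

```python
import math

def rot(angle, v):
    """Rotate the 2-vector v counter-clockwise by `angle` (SO(2) acting on R^2)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def mul(g, h):
    """(lam, v)(lam', v') = (lam lam', v + lam . v'), cf. (7.63)."""
    (lam, v), (mu, w) = g, h
    lw = rot(lam, w)
    return (lam + mu, (v[0] + lw[0], v[1] + lw[1]))

def inv(g):
    """(lam, v)^{-1} = (lam^{-1}, -lam^{-1} . v), cf. (7.64)."""
    lam, v = g
    w = rot(-lam, v)
    return (-lam, (-w[0], -w[1]))

def close(g, h, tol=1e-12):
    """Compare two group elements; angles are compared modulo 2*pi."""
    d = (g[0] - h[0]) % (2 * math.pi)
    ang_ok = min(d, 2 * math.pi - d) < tol
    return ang_ok and all(abs(a - b) < tol for a, b in zip(g[1], h[1]))

e = (0.0, (0.0, 0.0))  # identity element
g1, g2, g3 = (0.3, (1.0, -2.0)), (1.1, (0.5, 0.25)), (-2.0, (3.0, 1.0))

assoc_ok = close(mul(mul(g1, g2), g3), mul(g1, mul(g2, g3)))
inv_ok = close(mul(g1, inv(g1)), e) and close(mul(inv(g1), g1), e)
commutes = close(mul(g1, g2), mul(g2, g1))
```

In this sketch the product is associative and (7.64) gives a two-sided inverse, while `commutes` comes out false: the twist by the L-action makes the semi-direct product non-abelian even though both SO(2) and R^2 are abelian.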




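Then E(3) acts on $\mathbb{R}^3$ in the obvious way, giving rise to the Poisson manifold
$\mathfrak{g}^*\times Q=\mathbb{R}^3\times\mathbb{R}^3\times\mathbb{R}^3$ (since $\mathfrak{so}(3)^*\cong\mathbb{R}^3$). We now also have generators $(J_1,J_2,J_3)$
of the Lie algebra of SO(3), with corresponding functions $\hat J_i$, as well as standard
coordinate functions $(q_1,q_2,q_3)$ on $Q=\mathbb{R}^3$, giving rise to the Poisson brackets

$$\{\hat J_i,\hat J_j\} = -\epsilon_{ijk}\hat J_k; \qquad \{\hat J_i,p_j\} = -\epsilon_{ijk}p_k; \qquad \{p_i,p_j\} = 0; \tag{7.65}$$
$$\{\hat J_i,q_j\} = -\epsilon_{ijk}q_k; \qquad \{p_i,q_j\} = \delta_{ij}; \qquad \{q_i,q_j\} = 0. \tag{7.66}$$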


The appropriate target C*-algebra $C^*(G,Q)$ for quantization is a generalization
of $C^*(G)$, constructed in a similar way, as explained in §C.18. For the moment it is
enough to know that $C^*(G,Q)$ is the completion of the function space $C_c^\infty(G\times Q)$,
seen as a *-algebra in the operations (C.526) - (C.527), in a suitable norm, namely

$$\|f\| = \|\rho(f)\|, \tag{7.67}$$

where the representation $\rho: C_c^\infty(G\times Q)\to B(L^2(G\times Q))$ is given by (C.530). In
case that Q has a G-invariant measure $\nu$ (still with support Q), the operator

$$w: L^2(G\times Q)\to L^2(G\times Q); \tag{7.68}$$
$$w\psi(x,q) = \psi(x,x^{-1}q), \tag{7.69}$$

is unitary, and in terms of the notation

$$\tilde u(y) = w\,u(y)\,w^*, \qquad \tilde\pi(f) = w\,\pi(f)\,w^*, \qquad \tilde\rho(f) = w\,\rho(f)\,w^*, \tag{7.70}$$

the formulae (C.528) - (C.530) take the slightly more appealing form

$$\tilde u(y)\psi(x,q) = \psi(y^{-1}x,y^{-1}q); \tag{7.71}$$
$$\tilde\pi(f)\psi(x,q) = f(q)\psi(x,q); \tag{7.72}$$
$$\tilde\rho(f)\psi(x,q) = \int_G dy\, f(y,q)\,\psi(y^{-1}x,y^{-1}q). \tag{7.73}$$


The simplification thus gained especially concerns the position functions (7.72).
Analogously to (7.49), the quantization maps are given by


$$Q_\hbar : C_0(\mathfrak{g}^*\times Q)\to C^*(G,Q); \tag{7.74}$$
de; 
On(NAa) = | aap oO" 1O.e- #4), (7.75) 


where, as in the pure group case, strictly speaking f must lie in the dense subspace
of $C_0(\mathfrak{g}^*\times Q)$ consisting of Paley-Wiener functions (in $\theta$) that are the Fourier
transform (in the first argument) of functions that lie in $C_c^\infty(\mathfrak{g}\times Q)$.

Computations similar to (7.54) then establish, for $A\in\mathfrak{g}$ and $f\in C^\infty(Q)$ as before,

$$Q_\hbar(\hat A) = i\hbar\,\tilde u'(A); \tag{7.76}$$
$$Q_\hbar(f) = \tilde\pi(f). \tag{7.77}$$

From these formulae and (7.59) - (7.60), it is easy to verify that Dirac's exact
condition (7.8) holds in the following special cases:

$$\frac{i}{\hbar}[Q_\hbar(\hat A),Q_\hbar(\hat B)] = Q_\hbar(\{\hat A,\hat B\}); \tag{7.78}$$
$$\frac{i}{\hbar}[Q_\hbar(\hat A),Q_\hbar(f)] = Q_\hbar(\{\hat A,f\}); \tag{7.79}$$
$$\frac{i}{\hbar}[Q_\hbar(f),Q_\hbar(g)] = Q_\hbar(\{f,g\}) = 0. \tag{7.80}$$


These might be regarded as infinitesimal versions of the covariance condition
(C.514), specialized to the case at hand. We formalize this special case as follows.


Definition 7.2. Let G be a locally compact group and let Q be a space equipped
with some continuous G-action. A system of imprimitivity $(u(G),\pi(C_0(Q)))$ for
the given group action $G\circlearrowright Q$ is a combination of a strongly continuous unitary
representation u of G and a nondegenerate representation $\pi$ of $C_0(Q)$, both defined
on the same Hilbert space, that for each $x\in G$ and $f\in C_0(Q)$ satisfies

$$u(x)\pi(f)u(x)^* = \pi(L_x f). \tag{7.81}$$

Here $L_x f(q) = f(x^{-1}q)$, as usual. We recall from §C.18 that such systems of
imprimitivity bijectively correspond to non-degenerate representations $\rho=\pi\rtimes u$ of
$C^*(G,Q)$ through (C.515), which in the special case (C.524) - (C.525) comes down
to

$$\rho(f) = \int_G dx\,\pi(f(x,\cdot))\,u(x). \tag{7.82}$$


The formulae (7.71) - (7.73) define such a system of imprimitivity on the Hilbert
space $H=L^2(G\times Q)$. However, this cannot be the end result of quantization, since
this space is typically reducible under the pair $(u(G),\pi(C_0(Q)))$, or, equivalently,
under $\rho(C^*(G,Q))$. For example, this is the case for $G=\mathbb{R}^3$ or G = E(3) acting on
$Q=\mathbb{R}^3$ in the natural way discussed above, for which we obtain $H=L^2(\mathbb{R}^3\times\mathbb{R}^3)$
or even $H=L^2(E(3)\times\mathbb{R}^3)$. In the former case we do obtain the correct position
operators $q^i$, but for the momentum operators we find the curious expression
$-i\hbar(\partial/\partial x^i+\partial/\partial q^i)$; to their credit, these do satisfy the canonical commutation
relations (7.5), since these follow from (7.78) - (7.80), which in turn follow from
the covariance condition (7.81) defining a system of imprimitivity.
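
The curious momentum expression can be traced in two lines. The following worked computation is a sketch for $G=Q=\mathbb{R}^3$ acting by translations, assuming that the prime in (7.76) denotes the derivative $u'(A)=\frac{d}{dt}u(\exp(tA))|_{t=0}$ (the usual convention, which is not restated in this section) and that $y^{-1}$ in (7.71) is written additively as $-a$:

$$\tilde u(a)\psi(x,q)=\psi(x-a,q-a) \quad\Longrightarrow\quad Q_\hbar(\hat e_i)\psi = i\hbar\,\frac{d}{dt}\psi(x-te_i,q-te_i)\Big|_{t=0} = -i\hbar\left(\frac{\partial}{\partial x^i}+\frac{\partial}{\partial q^i}\right)\psi .$$

Since the position operator $Q_\hbar(q^j)$ is multiplication by $q^j$ and does not involve x, only the second derivative contributes to the commutator:

$$[Q_\hbar(\hat e_i),Q_\hbar(q^j)]\psi = -i\hbar\,\frac{\partial q^j}{\partial q^i}\,\psi = -i\hbar\,\delta_i^j\,\psi .$$

The extra term $\partial/\partial x^i$ is therefore invisible in the commutation relations, which is why it can only be removed by passing to an irreducible representation.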

Instead, we would prefer the Hilbert space $H=L^2(\mathbb{R}^3)$ expected from elementary
quantum mechanics (without spin), equipped with the system of imprimitivity

$$u(y)\psi(q) = \psi(y^{-1}q); \tag{7.83}$$
$$\pi(f)\psi(q) = f(q)\psi(q). \tag{7.84}$$

The answer lies in the search for irreducible systems of imprimitivity $(u(G),\pi(C_0(Q)))$,
or, equivalently, irreducible representations $\rho(C^*(G,Q))$; see §7.5.
7.4 Intermezzo: The Big Picture


First, however, we summarize and generalize the results in this chapter so far
into what we call The Big Picture. This arose in the 1990s from efforts to relate
Mackey's quantization theory based on systems of imprimitivity (which Mackey
himself saw as the natural implementation of what he called Weyl's Program, i.e.
the construction of the basic operators of quantum mechanics from group-theoretical
considerations) to deformation quantization (and hence to the tradition started by
Dirac, as continued by Groenewold, Moyal, Berezin, Flato, Rieffel, and others).

The Big Picture is technically based on the theory of Lie groupoids (already
alluded to in the preceding sections) and Lie algebroids. For a precise definition of
the former we refer to Definition C.115; briefly, a groupoid G is an object like a
group, where however multiplication is defined only partially (although the inverse
is defined for each element). To see which elements can be multiplied, one has maps
$s,t: G_1\to G_0$ from the total space $G_1$ of the groupoid to its base space $G_0$, such
that the product $xy\in G_1$ of $x,y\in G_1$ is defined whenever s(x) = t(y), and satisfies
s(xy) = s(y), t(xy) = t(x), and $s(x^{-1}) = t(x)$. Four relevant examples are:

• Spaces, where $G_1=G_0=Q$ for some set Q, with s(x) = t(x) = x for all $x\in G_1$,
and hence xy is defined iff y = x, with result xx = x; furthermore, $x^{-1}=x$.

• Groups, where $G_1=G$ and $G_0=\{e\}$, with s(x) = t(x) = e for all x, so that all
elements can be multiplied and the notion of a groupoid reduces to a group.

• Pair groupoids over a set Q have base space $G_0=Q$, total space $G_1=Q\times Q$, and
projections s(q,q') = q' and t(q,q') = q, so that (q,q')(r,r') is defined iff q' = r,
resulting in (q,q')(q',r') = (q,r'). The inverse is given by $(q,q')^{-1}=(q',q)$.

• Action groupoids (also called semi-direct product groupoids) are important in
what follows. These originate in some group action we denote by $G\circlearrowright Q$, where
G is a group and Q is a set. The ensuing groupoid is called $\Gamma=G\ltimes Q$, where

$$\Gamma_1 = G\times Q, \qquad \Gamma_0 = Q, \qquad s(x,q) = x^{-1}q, \qquad t(x,q) = q, \tag{7.85}$$

so that products (x,q)(y,q') are defined iff $q'=x^{-1}q$, with result

$$(x,q)(y,x^{-1}q) = (xy,q). \tag{7.86}$$

Finally, the inverse is (necessarily) given by

$$(x,q)^{-1} = (x^{-1},x^{-1}q). \tag{7.87}$$

A Lie groupoid is a groupoid G where $G_1$ and $G_0$ are manifolds and all operations
are smooth. In all examples just given this requires Q to be a manifold, and in the
last one G should be a Lie group, and the given action $G\times Q\to Q$ must be smooth.
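
The partial multiplication of an action groupoid is easy to test on a finite example. The sketch below assumes the hypothetical case of the cyclic group Z_5 acting on the set Q = Z_5 by translation (so no smoothness is involved); the function names are illustrative only. It implements (7.85) - (7.87) literally and collects the composable pairs.

```python
from itertools import product

N = 5  # hypothetical example: G = Z_5 acts on Q = Z_5 by x . q = q + x (mod 5)

def act(x, q):
    return (q + x) % N

def source(a):
    x, q = a
    return act(-x % N, q)      # s(x, q) = x^{-1} q, cf. (7.85)

def target(a):
    return a[1]                # t(x, q) = q, cf. (7.85)

def compose(a, b):
    """(x, q)(y, x^{-1} q) = (xy, q), cf. (7.86); None if s(a) != t(b)."""
    if source(a) != target(b):
        return None
    return ((a[0] + b[0]) % N, a[1])

def inverse(a):
    x, q = a
    return (-x % N, act(-x % N, q))   # (x, q)^{-1} = (x^{-1}, x^{-1} q), cf. (7.87)

arrows = list(product(range(N), range(N)))
composable = [(a, b) for a in arrows for b in arrows if compose(a, b) is not None]
```

Of the 625 ordered pairs of arrows only 125 are composable here, which is the sense in which multiplication is "defined only partially"; on those pairs one can confirm s(ab) = s(b) and t(ab) = t(a).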

Generalizing the construction of a Lie algebra $\mathfrak{g}$ from a given Lie group G, a Lie
groupoid comes with an associated linearized (or "infinitesimal") structure, called
a Lie algebroid. As in the group case, this differential-geometric notion can also be
defined independently of its origin in the theory of Lie groupoids, as follows:

Definition 7.3. A Lie algebroid E over a manifold Q is a vector bundle $E\xrightarrow{\pi}Q$ with
a vector bundle map $E\xrightarrow{\alpha}TQ$ (called the anchor), as well as with a Lie bracket [ , ]
on the space $C^\infty(Q,E)$ of smooth cross-sections of E, satisfying the Leibniz rule

$$[\sigma_1,f\sigma_2] = f\cdot[\sigma_1,\sigma_2] + ((\alpha\circ\sigma_1)f)\cdot\sigma_2, \tag{7.88}$$

for all $\sigma_1,\sigma_2\in C^\infty(Q,E)$ and $f\in C^\infty(Q)$ (here $\alpha\circ\sigma_1$ is a vector field on Q).


It follows that the map $\sigma\mapsto\alpha\circ\sigma: C^\infty(Q,E)\to C^\infty(Q,TQ)$ induced by the anchor
is a homomorphism of Lie algebras, where the latter is equipped with the usual
commutator of vector fields (this homomorphism property used to be part of the
definition of a Lie algebroid, but in fact it follows from the stated definition).

Lie algebroids generalize (finite-dimensional) Lie algebras as well as tangent 
bundles, and the (infinite-dimensional) Lie algebra C*(Q, E) could be said to be of 
geometric origin in the sense that it derives from an underlying finite-dimensional 
geometrical object. Similar to the above list of examples of Lie groupoids, one has 
the following basic classes of Lie algebroids. 


• Manifolds, where E = Q, seen as the zero-dimensional vector bundle over Q,
evidently with identically vanishing Lie bracket and anchor.

• Lie algebras, where $E=\mathfrak{g}$ and Q is a point (which may be identified with the
identity element of any Lie group with Lie algebra $\mathfrak{g}$) and anchor $\alpha=0$.

• Tangent bundles over a manifold Q, where E = TQ and $\alpha=\mathrm{id}: TQ\to TQ$, with
the Lie bracket given by the usual commutator of vector fields (or derivations).

• Action algebroids (or semi-direct product algebroids) are defined by a $\mathfrak{g}$-action
on a manifold Q, i.e. a Lie algebra homomorphism $\mathfrak{g}\to C^\infty(Q,TQ)$, $A\mapsto\delta_A$,
where we identify vector fields on Q with derivations on $C^\infty(Q)$; these are often,
but not necessarily, obtained from a G-action on Q via (3.71). We write
$E=\mathfrak{g}\ltimes Q$, which is $E=\mathfrak{g}\times Q$ as a trivial bundle (with $\pi$ the projection on the second
space), and $\alpha(A,q)=-\delta_A(q)\in T_qQ$, where $A\in\mathfrak{g}$. The Lie bracket is given by

$$[\sigma_1,\sigma_2](q) = [\sigma_1(q),\sigma_2(q)]_{\mathfrak{g}} + \delta_{\sigma_2}\sigma_1(q) - \delta_{\sigma_1}\sigma_2(q). \tag{7.89}$$


These examples may also be recovered as special cases of the following construction
that canonically associates a Lie algebroid Lie(G) to a Lie groupoid G: as a vector
bundle, Lie(G) is the restriction of $\ker(t_*)$ to $G_0$ (where $t_*: TG_1\to TG_0$ is the
derivative of the target projection $t: G_1\to G_0$), and the anchor is $\alpha=s_*$ (one
may alternatively define Lie(G) as the normal bundle to the object inclusion map
$\iota: G_0\hookrightarrow G_1$, cf. Definition C.115, but this makes the definition of the anchor a bit
more complicated). As in the Lie group case, one may identify sections of Lie(G)
with left-invariant vector fields on G, and under this identification the Lie bracket
on $C^\infty(G_0,\mathrm{Lie}(G))$ is by definition given by the commutator of vector fields.
Conversely, one may ask whether a given Lie algebroid E is integrable, in that
$E\cong\mathrm{Lie}(G)$ for some Lie groupoid G (where the isomorphism sign $\cong$ means that
a pertinent vector bundle isomorphism $E\cong\ker(t_*)|_{G_0}$ should preserve all relevant
structure). Unlike the special case of Lie groups (where Lie's Third Theorem 5.41
settles this in the positive), this is not necessarily the case, but that is of no concern here.
We now state a crucial connection between Lie algebroids and Poisson geometry.


Proposition 7.4. The dual vector bundle E* of a Lie algebroid E is a Poisson manifold,
whose Poisson bracket on $C^\infty(E^*)$ is defined by the following special cases:

$$\{f,g\} = 0 \qquad (f,g\in C^\infty(Q)); \tag{7.90}$$
$$\{\hat\sigma,f\} = -(\alpha\circ\sigma)f \qquad (\sigma\in C^\infty(Q,E),\ f\in C^\infty(Q)); \tag{7.91}$$
$$\{\hat\sigma_1,\hat\sigma_2\} = -\widehat{[\sigma_1,\sigma_2]}, \tag{7.92}$$

where $\hat\sigma\in C^\infty(E^*)$ is defined by a given section $\sigma$ of E through the obvious pairing.

Conversely, if the dual F* to a given vector bundle $F\to Q$ is a Poisson manifold
such that the Poisson bracket of two linear functions is linear, then F = E for some
Lie algebroid E over Q, with the above Poisson structure on E*.


Following our earlier lists, the main examples are: 


• A manifold Q, seen as the dual to the zero-dimensional vector bundle $Q\to Q$,
carries the zero Poisson structure.

• The dual $\mathfrak{g}^*$ of a Lie algebra $\mathfrak{g}$ acquires (minus) the Lie-Poisson structure (3.98).

• A cotangent bundle T*Q acquires (minus) the Poisson structure defined by its
standard symplectic structure, cf. (3.34).

• The dual $\mathfrak{g}^*\times Q$ of an action algebroid acquires the Poisson bracket (7.58).


The following theorem displays a rich and physically relevant class of examples
of Definition 7.1 of deformation quantization. The key point is that a Lie groupoid
G defines both classical and quantum data, namely the (reduced) Lie groupoid
C*-algebra $C^*_{(r)}(G)$ (cf. §C.17) and the Poisson manifold Lie(G)* (cf. Proposition 7.4),
and these are continuously (even smoothly) related through the tangent groupoid
$G^T$ (cf. Proposition C.117) and its associated Lie groupoid C*-algebra $C^*_{(r)}(G^T)$.


Theorem 7.5. For any Lie groupoid G, the bundle of C*-algebras given by

$$A_0 = C_0(\mathrm{Lie}(G)^*) \qquad (\hbar=0); \tag{7.93}$$
$$A_\hbar = C^*(G) \qquad (0<\hbar\leq 1); \tag{7.94}$$
$$A = C^*(G^T), \tag{7.95}$$

defines a deformation quantization of the Poisson manifold Lie(G)* over I = [0,1].
The same statement holds for the corresponding reduced groupoid C*-algebras.


The key lemma for this theorem is Theorem C.123, which provides the continuity of
the given bundle of C*-algebras. A lengthy computation shows that the Dirac-
Groenewold-Rieffel condition (7.14) is met as well. In this light, the quantization of the
phase space $T^*\mathbb{R}^n$ in §7.1 corresponds to the pair groupoid $G=\mathbb{R}^n\times\mathbb{R}^n$ on $\mathbb{R}^n$,
the one in §7.2 follows from the special case where the Lie groupoid G is "simply"
a Lie group, and the case of §7.3, which puts Mackey's quantization theory in a
deformation framework, is given by the action groupoid $G\ltimes Q$. Finally,
the space groupoid $G_0=G_1=Q$ gives a trivial continuous bundle of C*-algebras,
where $A_\hbar=C_0(Q)$ for all $\hbar\in[0,1]$, and Q carries the zero Poisson bracket.
7.5 Induced representations and the imprimitivity theorem


Returning to §7.3, we recall the bijective correspondence between systems of
imprimitivity $(u(G),\pi(C_0(Q)))$ and non-degenerate representations of the C*-algebra
$C^*(G,Q)$ of the action groupoid defined by the given action $G\circlearrowright Q$. This
correspondence preserves irreducibility, and our task is to find irreducible representations.

It was recognized at least 50 years ago that this task can be carried out if the
group action satisfies a certain regularity condition, and is hopeless otherwise. This
is sometimes called the Mackey-Glimm dichotomy. The condition in question may
be stated in a number of equivalent ways (whose equivalence is not at all obvious).

First, we recall some terminology from topology. Let X be a space. One calls
$Y\subset Y'\subset X$ relatively open in Y' if there is an open set $U\subset X$ such that $Y=Y'\cap U$.
A subset $Y\subset X$ is locally closed if each $y\in Y$ has an open neighbourhood U in X
such that $U\cap Y$ is closed in U, and finally "X is $T_0$" if for any two distinct points there
is an open set that contains exactly one of them. Furthermore, each $q\in Q$ defines a
G-orbit through q denoted by $G\cdot q$, as well as a stabilizer (or "little group")

$$G_q = \{x\in G \mid x\cdot q = q\}. \tag{7.96}$$

For any subgroup $H\subset G$, we denote the equivalence class of x in G/H by [x].


Definition 7.6. A smooth action of a Lie group G on a manifold Q is called regular
if one and hence each of the following equivalent conditions is satisfied:


1. Each G-orbit in Q is relatively open in its closure;

2. Each G-orbit in Q is locally closed;

3. The quotient space Q/G of G-orbits in Q is $T_0$;

4. Each map $[x]\mapsto x\cdot q$ is a homeomorphism from $G/G_q$ to the orbit $G\cdot q$ ($q\in Q$).


Probably the simplest example of a non-regular action is the action $\mathbb{Z}\circlearrowright\mathbb{T}$ given by

$$n: z\mapsto e^{2\pi i n\theta}z, \tag{7.97}$$

where $\theta\in\mathbb{R}\setminus\mathbb{Q}$ (here $\mathbb{Z}$ may be seen as a zero-dimensional Lie group with infinitely
many components; in fact, Definition 7.6 more generally applies to second countable
locally compact groups and spaces that are "almost Hausdorff"). Indeed, each
orbit is dense in $\mathbb{T}$ (but not open), and the orbit space $\mathbb{T}/\mathbb{Z}$ has no proper open sets.
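
Density of the orbits can be made plausible numerically, although a finite computation is of course no proof. The sketch below assumes the parametrization z = exp(2πiq) with q in [0,1), so that (7.97) becomes q ↦ q + nθ mod 1, and it only follows the forward half (n ≥ 0) of an orbit; `orbit` and `largest_gap` are illustrative helper names, and sqrt(2) - 1 stands in for an irrational θ up to floating-point precision.

```python
import math

def orbit(theta, n_points, q0=0.0):
    """First n_points of the forward orbit of q0 under q -> q + theta (mod 1)."""
    return [(q0 + n * theta) % 1.0 for n in range(n_points)]

def largest_gap(points):
    """Largest arc (as a fraction of the circle) that contains none of the points."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around arc
    return max(gaps)

theta_irr = math.sqrt(2) - 1   # irrational up to floating point
theta_rat = 0.25               # rational: the orbit closes after 4 steps

gap_irr = [largest_gap(orbit(theta_irr, n)) for n in (10, 100, 1000)]
gap_rat = largest_gap(orbit(theta_rat, 1000))
```

For the irrational angle the largest uncovered arc keeps shrinking as more orbit points are added, whereas for θ = 1/4 the orbit consists of four points and the gap stays at a quarter of the circle; only in the latter case is the orbit closed (and the action regular).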


Theorem 7.7. Let a group action $G\circlearrowright Q$ be regular. Then the irreducible representations
of the associated action groupoid C*-algebra $C^*(G,Q)$, and hence also the
irreducible systems of imprimitivity $(u(G),\pi(C_0(Q)))$, are classified up to unitary
equivalence by pairs $(\mathcal{O},u_\chi)$, where $\mathcal{O}$ is a G-orbit in Q and $u_\chi$ is an irreducible
representation of the stabilizer $G_q$ of an arbitrary point $q\in\mathcal{O}$, with an explicit
construction of the corresponding representation $\rho^{(\mathcal{O},u_\chi)}(C^*(G,Q))$. Two such
representations $\rho^{(\mathcal{O},u_\chi)}$ and $\rho^{(\mathcal{O}',u'_\chi)}$ are equivalent iff $\mathcal{O}=\mathcal{O}'$ and, given that $q'=xq$
and hence $G_{q'}=xG_qx^{-1}$ for some $x\in G$, $u'_\chi$ is unitarily equivalent to $u_\chi\circ\mathrm{Ad}(x)$.
Finally, any irreducible representation $\rho$ is unitarily equivalent to some $\rho^{(\mathcal{O},u_\chi)}$.
In the simplest case, Q is equal to a point, so that $C^*(G,Q)=C^*(G)$, and we find
that irreducible representations of $C^*(G)$ (which are necessarily non-degenerate)
bijectively correspond to unitary irreducible representations of G. In the next easiest
case, G acts nontrivially but still transitively on Q, in which case the action is clearly
regular and $Q\cong G/H$ through the G-equivariant map in no. 4 of the above definition
(read in the opposite direction), i.e., we pick some $q_0\in Q$, define $H=G_{q_0}$, and
finally map Q to G/H by $q\mapsto[x]$, where $q=xq_0$ (this map is well defined); in that
case, we might as well have assumed that Q = G/H to begin with. The following
important corollary of Theorem 7.7 is called the Imprimitivity Theorem.


Corollary 7.8. Up to unitary equivalence, irreducible representations of $C^*(G,G/H)$
(or, equivalently, of pairs $(\pi(C_0(G/H)),u(G))$ satisfying the covariance condition
(7.81)) bijectively correspond to unitary irreducible representations of H.


In preparation for the general case stated in Theorem 7.7, and also as a goal in
itself, we first give an explicit construction of the irreducible representation $\rho^\chi$
of $C^*(G,G/H)$ corresponding to a given unitary irreducible representation $u_\chi(H)$,
where we label the unitary irreducible representations of H (up to unitary equivalence)
by $\chi\in\hat H$ (where $\hat H$ is the set of unitary equivalence classes of unitary
irreducible representations of H, cf. §C.15 for the abelian case), and let the
corresponding representation $\rho^\chi(C^*(G,G/H))$, or the pair $\pi^\chi(C_0(G/H))$ and $u^\chi(G)$,
inherit this label (in raised form, in order to prevent confusion between $u_\chi(H)$ and
$u^\chi(G)$).

The construction of $\rho^\chi(C^*(G,G/H))$, or, equivalently, of a system of imprimitivity
$(\pi^\chi(C_0(G/H)),u^\chi(G))$, from $u_\chi(H)$ proceeds by the technique of induced
representations (which physicists may be familiar with from the representation theory
of the Poincaré group, see Theorem 7.9 below). We start from a specific realization
of $u_\chi(H)$ on a Hilbert space $H_\chi$ (which is finite-dimensional if H is compact or
abelian). From this, we construct a new Hilbert space $H^\chi$, whose realization depends
on the choice of a quasi-invariant measure $\nu$ on G/H, i.e., a (non-zero) measure
whose null-sets are G-invariant in the sense that if $\nu(A)=0$ for some (Borel)
measurable $A\subset G/H$, then also $\nu(x\cdot A)=0$ for each $x\in G$. This will surely be the
case if $\nu$ is invariant, i.e., if $\nu(x\cdot A)=\nu(A)$ for each measurable A, but invariant
measures on G/H may not exist, whereas quasi-invariant measures always do.

We now consider (measurable) functions $\psi: G\to H_\chi$ that satisfy

$$\psi(xh) = u_\chi(h^{-1})\psi(x), \tag{7.98}$$

for every $x\in G$ and $h\in H$; equivalently, we may say that

$$u_\chi(h)R_h\psi = \psi, \tag{7.99}$$

for each $h\in H$, where $R_h\psi(x)=\psi(xh)$. Now if $\psi$ and $\varphi$ both satisfy (7.98), then,
by unitarity of $u_\chi$, their inner product $\langle\varphi(x),\psi(x)\rangle_{H_\chi}$ in $H_\chi$ is H-invariant, in that

$$\langle\varphi(xh),\psi(xh)\rangle_{H_\chi} = \langle\varphi(x),\psi(x)\rangle_{H_\chi}. \tag{7.100}$$

Hence the function $x\mapsto\langle\varphi(x),\psi(x)\rangle_{H_\chi}$, a priori defined from G to $\mathbb{C}$, induces
a function $[x]\mapsto\langle\varphi(x),\psi(x)\rangle_{H_\chi}$ from G/H to $\mathbb{C}$. We write the latter function as
$\langle\varphi,\psi\rangle_{H_\chi}[x]$; in particular, taking $\varphi=\psi$, we write $\|\psi\|^2_{H_\chi}[x]=\langle\psi(x),\psi(x)\rangle_{H_\chi}$. We
may then define a new Hilbert space $H^\chi$ that consists of all measurable functions
$\psi: G\to H_\chi$ that for each $h\in H$ satisfy (7.98), and are square-integrable on G/H:

$$\int_{G/H} d\nu([x])\,\|\psi\|^2_{H_\chi}[x] < \infty. \tag{7.101}$$

This space turns out to be complete in the natural inner product

$$\langle\varphi,\psi\rangle = \int_{G/H} d\nu([x])\,\langle\varphi,\psi\rangle_{H_\chi}[x]. \tag{7.102}$$

It also carries a system of imprimitivity: in case that $\nu$ is G-invariant we simply have

$$u^\chi(y)\psi(x) = \psi(y^{-1}x) \qquad (y\in G); \tag{7.103}$$
$$\pi^\chi(f)\psi(x) = f([x])\psi(x) \qquad (f\in C_0(G/H)), \tag{7.104}$$

where we note that $u^\chi(y)\psi$ satisfies (7.98) if $\psi$ does. Unitarity of $u^\chi$ as well as the
covariance condition (7.81) are easily checked. In general, we replace (7.103) by

$$u^\chi(y)\psi(x) = \sqrt{\frac{d\nu([y^{-1}x])}{d\nu([x])}}\;\psi(y^{-1}x), \tag{7.105}$$

where $d\nu([y^{-1}\cdot])/d\nu([\cdot])$ is the Radon-Nikodym derivative of the translated measure
$L_y^*\nu$ with respect to $\nu$, cf. (B.137), which is well defined because by the assumption
of quasi-invariance, $L_y^*\nu$ is absolutely continuous with respect to $\nu$ (indeed, on
this assumption they are even equivalent). Here $L_y^*\nu(A)=\nu(L_y^{-1}(A))$, $A\subset G/H$.
Physicists do not like the Hilbert space $H^\chi$, preferring a different realization

$$\tilde H^\chi = L^2(G/H)\otimes H_\chi, \tag{7.106}$$

in which the wave-function $\psi$ is not constrained and one has a clean separation
between the (typically) spatial degree of freedom Q = G/H and the internal degree
of freedom $H_\chi$. One half of the system of imprimitivity will then be given nicely by

$$\tilde\pi^\chi(f)\psi = f\psi \qquad (f\in C_0(G/H)), \tag{7.107}$$

but this cleanliness comes at the cost of a more complicated formula for $\tilde u^\chi(y)$, as
follows. Pick a (measurable) cross-section $s: G/H\to G$, i.e., a right inverse to the
projection $p: G\to G/H$, p(x) = [x]; in other words, we have

$$p\circ s = \mathrm{id}_{G/H}. \tag{7.108}$$
It may not be possible to make s continuous, and, crucially, s is not a left inverse to
p; instead, there exists a unique function $h_s: G\to H$ such that $s\circ p(x)=xh_s(x)$, i.e.,

$$h_s(x) = x^{-1}s([x]). \tag{7.109}$$

Such a cross-section s gives rise to a unitary isomorphism

$$w_s: H^\chi\to\tilde H^\chi; \tag{7.110}$$
$$w_s\psi(q) = \psi(s(q)); \tag{7.111}$$
$$w_s^{-1}\tilde\psi(x) = u_\chi(h_s(x))\tilde\psi([x]), \tag{7.112}$$

which enables us to move the system of imprimitivity $(u^\chi,\pi^\chi)$ to $\tilde H^\chi$ by defining

$$\tilde u^\chi(y) = w_su^\chi(y)w_s^{-1} \qquad (y\in G); \tag{7.113}$$
$$\tilde\pi^\chi(f) = w_s\pi^\chi(f)w_s^{-1} \qquad (f\in C_0(G/H)). \tag{7.114}$$

This duly leads to (7.107), but instead of (7.105), we obtain the more cumbersome

$$\tilde u^\chi(y)\tilde\psi(q) = \sqrt{\frac{d\nu(y^{-1}q)}{d\nu(q)}}\;u_\chi\big(s(q)^{-1}ys(y^{-1}q)\big)\tilde\psi(y^{-1}q), \tag{7.115}$$

where of course the square root may be omitted if $\nu$ is G-invariant, as in (7.103).
The argument $h=s(q)^{-1}ys(y^{-1}q)$ of $u_\chi$ appearing here is called the Wigner cocycle
(after the physicist who first introduced it in his classification of the irreducible
representations of the Poincaré group). One may verify that $h\in H$ by applying p,
which by construction is G-equivariant (i.e., p(xy) = xp(y)), which gives

$$p(h) = p\big(s(q)^{-1}ys(y^{-1}q)\big) = s(q)^{-1}y\,p(s(y^{-1}q)) = s(q)^{-1}yy^{-1}q = s(q)^{-1}q,$$

where in the third step we used (7.108). For any $x\in G$ we have $x^{-1}[x]=[x^{-1}x]=[e]$,
so taking x = s(q) in this computation we find p(h) = [e], which is true iff $h\in H$.

Given an irreducible system of imprimitivity $(\tilde u^\chi,\tilde\pi^\chi)$, we obtain generalized
momentum operators by passing to the associated representation of the Lie algebra
$\mathfrak{g}$ of G through (5.156) and (7.57), i.e.,

$$\tilde\pi^\chi(A) = i\hbar(\tilde u^\chi)'(A), \tag{7.116}$$

where $A\in\mathfrak{g}$, so that, cf. (7.78) - (7.80), we obtain from (5.160) and (7.81):

$$[\tilde\pi^\chi(A),\tilde\pi^\chi(B)] = i\hbar\,\tilde\pi^\chi([A,B]); \tag{7.117}$$
$$[\tilde\pi^\chi(A),\tilde\pi^\chi(f)] = i\hbar\,\tilde\pi^\chi(\delta_Af); \tag{7.118}$$
$$[\tilde\pi^\chi(f),\tilde\pi^\chi(g)] = 0, \tag{7.119}$$

where $A,B\in\mathfrak{g}$ and $f,g\in C_0(Q)$ (in fact, these formulae, defined on the right
domain, work also for many unbounded functions on Q, see below), and $\delta_A$ is
defined in (3.71). Let us take a look at a few illustrative special cases:
• If H = G, then Q is a point, so that $C^*(G,Q)=C^*(G)$, and systems of imprimitivity
are just irreducible representations of G. We have $H^\chi\cong H_\chi$ through the map
$w: H^\chi\to H_\chi$ defined by $\psi\mapsto\psi(e)=\psi'\in H_\chi$, with inverse $\psi(x)=u_\chi(x^{-1})\psi'$.
This gives $wu^\chi(y)w^{-1}=u_\chi(y)$. Similarly, in (7.115) we take s = e, which gives
$\tilde u^\chi(y)=u_\chi(y)$ on $\tilde H^\chi=H_\chi$.

• If H = {e} we have Q = G and $C^*(G,G)\cong B_0(L^2(G))$, which quantizes the
underlying classical phase space $\mathfrak{g}^*\times G\cong T^*G$. We now have $H=L^2(G)$ carrying
the left-regular representation of G.

• Let G = E(3) act canonically on $Q=\mathbb{R}^3$. Taking $q_0=0$ gives H = SO(3), so
irreducible systems of imprimitivity are classified by j = 0, 1, ..., with corresponding
irreducible representations $D_j(SO(3))$ on $H_j=\mathbb{C}^{2j+1}$, cf. §5.8. Hence

$$\tilde H^j = L^2(\mathbb{R}^3)\otimes H_j, \tag{7.120}$$

and using the cross-section $s(q)=(1_3,q)$ from $\mathbb{R}^3$ to E(3) we obtain, from
(7.115) with (7.63) - (7.64) and (7.107), the expressions

$$\tilde u^j(R,a)\psi(q) = D_j(R)\psi(R^{-1}(q-a)); \tag{7.121}$$
$$\tilde\pi^j(f)\psi(q) = f(q)\psi(q). \tag{7.122}$$

For j = 0 this gives the usual quantum theory of a spinless particle:

1. The Hilbert space is $\tilde H^0=L^2(\mathbb{R}^3)$.

2. For the generators of $\mathbb{R}^3\subset E(3)$ we duly obtain the momentum operators

$$P_j = -i\hbar\frac{\partial}{\partial q^j}, \tag{7.123}$$

where $P_j=\tilde\pi^0(e_j)$ is defined in terms of the standard basis $(e_1,e_2,e_3)$ of $\mathbb{R}^3$,
now seen as the Lie algebra of $\mathbb{R}^3$.

3. Using the basis (3.66) of the Lie algebra of SO(3) ⊂ E(3), we obtain the
orbital angular momentum operators (which pick up extra terms for j > 0):

$$\tilde\pi^0(J_1) = i\hbar\left(q^3\frac{\partial}{\partial q^2}-q^2\frac{\partial}{\partial q^3}\right); \tag{7.124}$$
$$\tilde\pi^0(J_2) = i\hbar\left(q^1\frac{\partial}{\partial q^3}-q^3\frac{\partial}{\partial q^1}\right); \tag{7.125}$$
$$\tilde\pi^0(J_3) = i\hbar\left(q^2\frac{\partial}{\partial q^1}-q^1\frac{\partial}{\partial q^2}\right). \tag{7.126}$$

4. The coordinate functions $f(q)=q^j$ yield the position operators $Q_j=\tilde\pi^0(q^j)$:

$$Q_j\psi(q) = q^j\psi(q). \tag{7.127}$$

5. Thus we obtain all the familiar commutation relations like $[Q_j,P_k]=i\hbar\delta_{jk}$,
$[\tilde\pi^0(J_1),\tilde\pi^0(J_2)]=i\hbar\tilde\pi^0(J_3)$, etc., cf. (7.65) - (7.66).
• Let $G=\mathbb{R}$ act on $Q=\mathbb{T}$, which we parametrize by $z=\exp(2\pi iq)$, $q\in[0,1)$, by

$$a: \exp(2\pi iq)\mapsto\exp(2\pi i(q+a)), \tag{7.128}$$

so that $H=\mathbb{Z}$, with $\hat H=\mathbb{T}$ under $u_z(n)=z^n$, $z\in\mathbb{T}$, $n\in\mathbb{Z}$, cf. (C.349). We
parametrize $\hat H$ by $z=\exp(i\theta)$, $\theta\in[0,2\pi)$, so that (with slight abuse of notation)
$u_\theta(n)=e^{in\theta}$. In the second description (i.e. the one of the physicists) we have

$$\tilde H^\theta = L^2(\mathbb{T})\cong L^2(0,1), \tag{7.129}$$

where the topology of Q is lost for the moment. Using the cross-section

$$s(e^{2\pi iq}) = q, \tag{7.130}$$

where $q\in[0,1)$, we obtain

$$\tilde u^\theta(a)\psi(q) = e^{in(a,q)\theta}\psi(q-a+n(a,q)), \tag{7.131}$$

where $n(a,q)\in\mathbb{Z}$ is the unique integer such that $q-a+n(a,q)\in[0,1)$. The
corresponding momentum operator is formally given by the usual expression
$P=-i\hbar\,d/dq$, cf. (7.123), which appears to be independent of $\theta$ (since for any
$q\in(0,1)$ and a small enough we have n(a,q) = 0), but in fact the $\theta$-dependence
is in its domain, which can be shown to consist of the subspace of the Sobolev
space $H^1(0,1)$ (i.e. the closure of $C^\infty([0,1])$ in the inner product (5.318) adapted
to $L^2(0,1)$, which implies $H^1(0,1)\subset C([0,1])$) whose elements satisfy

$$\psi(1) = e^{-i\theta}\psi(0). \tag{7.132}$$

To see this, we recall that

$$P\psi = i\hbar\lim_{\varepsilon\to0}\frac{\tilde u^\theta(\varepsilon)\psi-\psi}{\varepsilon}, \tag{7.133}$$

where the limit is taken in the $L^2$-norm, so that we need existence of

$$\lim_{\varepsilon\to0}\varepsilon^{-2}\int_0^1dq\,\big|e^{in(\varepsilon,q)\theta}\psi(q-\varepsilon+n(\varepsilon,q))-\psi(q)\big|^2.$$

For $0\leq q<\varepsilon$ we have $n(\varepsilon,q)=1$, whereas for $\varepsilon\leq q<1$ we have $n(\varepsilon,q)=0$,
so it is convenient to split the integral as a sum of $\int_0^\varepsilon$ and $\int_\varepsilon^1$. The second term
enforces the existence of derivatives in the $L^2$-sense (which in turn makes $\psi$
continuous on [0,1]) and is unproblematic, but the first requires the existence of

$$\lim_{\varepsilon\to0}\varepsilon^{-2}\int_0^\varepsilon dq\,\big|e^{i\theta}\psi(q-\varepsilon+1)-\psi(q)\big|^2.$$

This strange expression, then, enforces the boundary condition (7.132). In this
case there is no single position operator, but the algebra $C(\mathbb{T})$ plays its role.
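
That the twisted translations on the circle really form a representation of R, i.e. that the phases in (7.131) compose correctly, can be checked numerically. The sketch below assumes the reading of (7.131) with phase exp(i n(a,q) θ); it uses exact rational arithmetic for positions so that n(a,q) is computed without rounding, and `psi0` is an arbitrary hypothetical test function rather than an element of any particular domain.

```python
import cmath
import math
from fractions import Fraction

def n_shift(a, q):
    """Unique integer n(a, q) such that q - a + n(a, q) lies in [0, 1)."""
    return -math.floor(q - a)

def u_tilde(theta, a, psi):
    """Return the function u~^theta(a) psi on [0, 1), cf. (7.131)."""
    def new_psi(q):
        n = n_shift(a, q)
        return cmath.exp(1j * n * theta) * psi(q - a + n)
    return new_psi

def psi0(q):
    """Hypothetical test function on [0, 1); no boundary condition is imposed."""
    x = float(q)
    return complex(x * x + 1.0, math.sin(3.0 * x))

theta = 0.7
a, b = Fraction(5, 3), Fraction(-7, 4)
samples = [Fraction(k, 11) for k in range(11)]

# u~(a) u~(b) psi versus u~(a + b) psi, evaluated at the sample points
lhs = [u_tilde(theta, a, u_tilde(theta, b, psi0))(q) for q in samples]
rhs = [u_tilde(theta, a + b, psi0)(q) for q in samples]
max_err = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The two sides agree up to rounding because the integers n(a,q) satisfy the additive form of the Wigner cocycle identity, n(a+b,q) = n(a,q) + n(b, q-a+n(a,q)); this is the special case of (7.115) in which the cocycle takes values in H = Z.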
7.6 Representations of semi-direct products


The case Q = G/H also provides the key for the general case, as long as the G-action
on Q is regular, cf. Theorem 7.7. In that case, the construction of the irreducible
system of imprimitivity $(u(G),\pi(C_0(Q)))$ corresponding to a pair $(\mathcal{O},u_\chi(H))$, where
$\mathcal{O}$ is a G-orbit in Q, requires no new ideas: we have $\mathcal{O}\cong G/H$, and hence $u=u^\chi$
and $\pi=\pi^\chi$ as described in §7.5 (where the function f in formulae like (7.104) or
(7.114), which in these expressions was defined on G/H, should be seen as the
restriction of $f\in C_0(Q)$ to $\mathcal{O}\subset Q$). An important application of this construction is
the representation theory of regular semi-direct products $L\ltimes V$ (cf. §7.3), where
regularity means that the dual L-action on V* is regular; this action is given by

$$\lambda\cdot\theta(v) = \theta(\lambda^{-1}\cdot v) \qquad (\lambda\in L,\ \theta\in V^*,\ v\in V). \tag{7.134}$$


Theorem 7.9. Up to unitary equivalence, the irreducible unitary representations of
a regular semi-direct product $G=L\ltimes V$ are classified by pairs $(\mathcal{O},\sigma)$, where $\mathcal{O}$ is
an L-orbit in V* and $\sigma$ is an element of the unitary dual of the stabilizer $L_0\subset L$ of an
arbitrary point $\theta_0\in\mathcal{O}$. The corresponding representation $u^{(\mathcal{O},\sigma)}(G)$ may be realized
from an irreducible representation $u_\sigma$ of $L_0$ on a Hilbert space $H_\sigma$ combined with a
cross-section $s: L/L_0\to L$ of the canonical projection $p: L\to L/L_0$, namely through

$$H^{(\mathcal{O},\sigma)} = L^2(L/L_0)\otimes H_\sigma; \tag{7.135}$$
$$u^{(\mathcal{O},\sigma)}(\lambda,v)\psi(\theta) = e^{i\theta(v)}u_\sigma\big(s(\theta)^{-1}\lambda s(\lambda^{-1}\theta)\big)\psi(\lambda^{-1}\theta). \tag{7.136}$$
Proof. Let u be a unitary representation of G. This implies

$$u(\lambda)u(v)u(\lambda^{-1}) = u(\lambda\cdot v), \tag{7.137}$$

in which $\lambda\equiv(\lambda,0)$ and $v\equiv(e,v)$. Since $V\subset G$ is abelian, we have $C^*(V)\cong C_0(V^*)$
by the Fourier transform (cf. Theorem C.109 in §C.15), which here is given by
(7.44) - (7.45), with $A\rightsquigarrow v$. Hence the representation $u'(C^*(V))$ defined by u(V)
via (5.172), seen as a representation of $C_0(V^*)$ via the Fourier transform, is given
by
u! (f) = (2m) L 7 dvd" 0c) £(@)u(v). (7.138) 


Using invariance of the measure d"vd"0 under the joint transformation (v,0) ~» 
(A -v,A -@), from (7.137) we obtain, for f € Co(V*) in the image of f € C?(V), 


nae! Puayr= Qn)" | dvd" c) ¢(8)u(A-v) 
=n)" [ dtvd"@e@ 94 FAA -B)u(A-v) 


=(2ny" | d’va"6 e) f(A! O)u(v) 
Vxv* 
=u! (L;f). (7.139) 
Consequently, a unitary representation $u(L\ltimes V)$ defines a system of imprimitivity
$(u(L),u'(C_0(V^*)))$, and vice versa, since any pair of representations (u(L),u(V))
that satisfies (7.137) gives rise to a representation u(G) by $u(\lambda,v)=u(v)u(\lambda)$.

Now apply Theorem 7.7 with $G\rightsquigarrow L$ and $Q\rightsquigarrow V^*$. All we need in order to obtain
(7.135) - (7.136) from (7.106) and (7.107) - (7.115) is to find the representation
u(V) that induces the representation $u'(C_0(V^*))$ given by (7.107), namely

$$u(v)\psi(\theta) = e^{i\theta(v)}\psi(\theta), \tag{7.140}$$

as is easily checked from (7.138).


In view of this, we have a remarkable group-groupoid C*-algebra isomorphism

$$C^*(L\ltimes V)\cong C^*(L\ltimes V^*), \tag{7.141}$$

where the left-hand side is just the C*-algebra of the group $L\ltimes V$, whereas the
right-hand side is the C*-algebra of the action groupoid $L\ltimes V^*$ relative to (7.134). Also,
a computation shows that the same formulae (7.135) - (7.136) are obtained if, given
$\theta\in V^*$ and hence given $L_0$ as its stabilizer, we define a subgroup $H\subset G$ by

$$H = L_0\ltimes V, \tag{7.142}$$

and induce from the representation $u_{(\theta,\sigma)}$ of H defined by

$$u_{(\theta,\sigma)}(\lambda,v) = e^{i\theta(v)}u_\sigma(\lambda). \tag{7.143}$$

We briefly discuss four basic examples from physics, each of which is easily seen
to be regular. We write a instead of v in $(\lambda,v)\in G$ so as to emphasize the "spatial"
character of V, whereas V* is labeled by a dual "momentum" variable p.


• $G=E(2)=SO(2)\ltimes\mathbb{R}^2$, defined like E(3), i.e., with respect to the usual action
of SO(2) on $\mathbb{R}^2$ (this group will play a role in the representation theory of the
Poincaré group). We find the same action of SO(2) on $(\mathbb{R}^2)^*\cong\mathbb{R}^2$, so that the
orbits are $\mathcal{O}_0=\{0\}$ with stabilizer SO(2) and $\mathcal{O}_r=\{(x,y)\in\mathbb{R}^2\mid x^2+y^2=r^2\}$ for
r > 0, with stabilizer {e}. Thus the Hilbert spaces and representations are given by

$$H^{(0,n)} = \mathbb{C}; \tag{7.144}$$
$$u^{(0,n)}(\lambda,a) = e^{2\pi in\lambda}; \tag{7.145}$$
$$\tilde H^r = L^2(0,1); \tag{7.146}$$
$$\tilde u^r(\lambda,a)\psi(p) = e^{i(a_1r\cos p'+a_2r\sin p')}\psi(p-\lambda\ \mathrm{mod}\ 1), \tag{7.147}$$

where $n\in\mathbb{Z}$, $\lambda\in[0,1)$, $p\in[0,1)$, and $p'=2\pi p$. In the first case $\mathbb{R}^2\subset E(2)$ is
represented trivially, whereas in the second the r-dependence of the representation
lies entirely in $\mathbb{R}^2$ (since $\tilde H^r$ and $\tilde u^r(\lambda,0)$ are evidently independent of r).
The projective representations of G are of considerable interest, too, cf. §5.10.


Lemma 7.10. If $G = SO(p,q) \ltimes \mathbb{R}^{p+q}$ (p > 0, q > 0), then $H^2(\mathfrak{g},\mathbb{R}) = 0$.

Here SO(p,q) is the subgroup of $SL_{p+q}(\mathbb{R})$ whose elements leave the form

$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2$


invariant; the best-known example is the (proper) Lorentz group SO(3,1), see 
below. This lemma may be proved by a straightforward but lengthy computation. 
By Theorem 5.59, the projective unitary representations of G then correspond to 
the ordinary unitary representations of the universal covering 


$\tilde{G} = \mathbb{R} \ltimes \mathbb{R}^2$, (7.148)


where R acts on R? through the covering projection # : R > SO(2) = R/Z, cf. 
Theorem 5.41 (with D ~~ Z). This changes the expressions (7.144) - (7.147) into 


AO) —¢; (7.149) 
(0) (A,a) = eish. (7.150) 
#®) = 77(0,1); (7.151) 


where A € R,s ER, 6 € (0,27), p € (0,1), and n(A, p) is defined as in (7.131). 
• $G = E(3) = SO(3) \ltimes \mathbb{R}^3$, as before with the defining action of SO(3). The SO(3)-
orbits in $(\mathbb{R}^3)^* \cong \mathbb{R}^3$ are spheres $S^2_r \cong SO(3)/SO(2)$ with radius $r > 0$, as well as
the origin ($r = 0$) with stabilizer SO(3), so that for the Hilbert spaces we obtain


$H^{(0,j)} = \mathbb{C}^{2j+1}$; (7.153)
$H^{(r,n)} = L^2(S^2_r)$, (7.154)
where $j = 0,1,\ldots$ labels the unitary irreducible representations of SO(3) on $H_j = \mathbb{C}^{2j+1}$,
whereas $n \in \mathbb{Z}$ labels the irreducible representations of SO(2) on $\mathbb{C}$ (we
write $S^2 \equiv S^2_r$). In the second case, the representation $u^{(r,n)}$ of $SO(3) \subset E(3)$
depends explicitly on $n$ through the Wigner cocycle; for $n = 0$ we simply obtain


a” (R,a)H(p) =e"? *H(R |p). (7.155) 


For n 4 0 we just give a formula for ai"”) (R, a) in case that R is a rotation around 
the z-axis and a = 0; this is enough to make the point. To this end we parametrize 
SO(3) by the well-known Euler angles, i-e., in terms of the matrices Jj, cf. (3.66), 


R(o,0, 0) = e392 e%3 (7.156) 
and write q € S? as gq = (6,0) = R(¢,0,0)e3 with e3 = (0,0, 1) (the spherical 
coordinates of g are (@ — 42, )). This also provides S? with an SO(3)-invariant 


measure dv(@,@) =d@d@sin@. A convenient choice of s : S7 + SO(3) is 
in which case we simply obtain, writing $R_z(\alpha) = R(\alpha,0,0)$,


al") (R(0t),0) (9,8) = eH (g — oF, 8). (7.158) 


The universal covering group of E(3) is 


$\widetilde{E(3)} = SU(2) \ltimes \mathbb{R}^3$, (7.159)


where $SU(2) = \widetilde{SO(3)}$ acts on $\mathbb{R}^3$ through its covering projection $\tilde{\pi}$ onto SO(3),
as in the previous case. By Theorem 5.59 and Lemma 7.10, the projective unitary
irreducible representations of E(3) are given by the unitary irreducible representations
of $SU(2) \ltimes \mathbb{R}^3$. This obviously leads to additional half-integral values for
$j$ in (7.153), since this number now labels the unitary irreducible representations
of SU(2). As to $n$ in (7.154), the subgroup $H \subset SU(2)$ that stabilizes $(0,0,r) \in S^2$
consists of all matrices $u_z = \mathrm{diag}(z, \bar{z})$, where $z \in \mathbb{T}$, so $H \cong \mathbb{T}$ and hence $\hat{H} \cong \mathbb{Z}$
under $u_z \mapsto z^m$, $m \in \mathbb{Z}$. We now recall from the proof of Proposition 5.5 that


$u = \cos(\theta/2) \cdot 1_2 + i \sin(\theta/2)\, \mathbf{u} \cdot \sigma \in SU(2)$, (7.160)


where $\mathbf{u}$ is a unit vector in $\mathbb{R}^3$, projects to $\tilde{\pi}(u) = R_\theta(\mathbf{u}) \in SO(3)$, i.e., the rotation
around $\mathbf{u}$ by an angle $\theta$. Parametrizing $z = \cos(\alpha/2) + i \sin(\alpha/2)$, $\alpha \in [0,4\pi)$,
therefore gives $\tilde{\pi}(u_z) = \exp(\alpha J_3)$. Besides (7.157), we now also need a cross-
section $\tilde{s} : S^2 \to SU(2)$, for which the above analysis suggests we take


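$\tilde{s}(\phi,\theta) = u^{(3)}(\phi)\, u^{(2)}(\theta)\, u^{(3)}(-\phi)$; (7.161)
$u^{(2)}(\theta) = \cos(\tfrac{1}{2}\theta) \cdot 1_2 + i \sin(\tfrac{1}{2}\theta) \cdot \sigma_2$; (7.162)
$u^{(3)}(\phi) = \cos(\phi/2) \cdot 1_2 + i \sin(\phi/2) \cdot \sigma_3$; (7.163)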


note that $u_z = u^{(3)}(\alpha)$. A calculation similar to the one leading to (7.158) gives
al") (u;,0) (9,0) =e" (o — a, 8). (7.164) 


Comparing (7.158) and (7.164), we see that if $m$ is even, then $n = m/2$ (of course,
by convention we may replace $m/2$ in (7.164) by $n$ on the understanding that $n$
may now be half-integral). If $m$ is odd, choosing $\alpha = 2\pi$ we famously obtain


$u^{(r,m)}(-1_2, 0)\psi = -\psi$. (7.165)


More generally, if we take a closed path $t \mapsto R_{2\pi t}(\mathbf{u})$, $t \in [0,1]$, in SO(3),
which starts and ends at $1_3$, and lift it (with respect to the covering projection
$\tilde{\pi} : SU(2) \to SO(3)$) to a path $t \mapsto \tilde{u}(t) = \cos(\pi t) + i \sin(\pi t)\, \mathbf{u} \cdot \sigma$ in SU(2),
which now starts at $1_2$ and ends at $-1_2$, then the corresponding representation
$u^{(r,m)}(\tilde{u}(t), 0)$, evaluated at the endpoint $t = 1$, takes the wave-function $\psi$ to itself
if $m$ is even, whereas it takes $\psi$ to $-\psi$ whenever $m$ is odd (this is an embryonic version
of the connection between spin and statistics, fully realized only in quantum field theory).
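
The sign flip just described can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument: it builds the SU(2) element (7.160) as a 2 x 2 nested list, lifts the closed path of rotations by 2πt, and evaluates the character $u_z \mapsto z^m$ of the stabilizer at α = 2π. All function names (`su2`, `rotation`, `lifted_path`, `stabilizer_character`) are hypothetical helpers introduced for illustration, and the map `rotation` uses the convention $u \sigma_j u^* = \sum_i R_{ij} \sigma_i$, which is an assumption that may differ from the book's by the sense of rotation (this is immaterial at the endpoint).

```python
import cmath
import math

# Pauli matrices sigma_1, sigma_2, sigma_3 as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def su2(theta, axis):
    """u = cos(theta/2) 1 + i sin(theta/2) axis.sigma, cf. (7.160)."""
    ux, uy, uz = axis
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * uz, 1j * s * (ux - 1j * uy)],
            [1j * s * (ux + 1j * uy), c - 1j * s * uz]]


def rotation(u):
    """Image of u in SO(3): R_ij = (1/2) Tr(sigma_i u sigma_j u*)."""
    ud = dagger(u)
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(SIGMA[i], matmul(u, matmul(SIGMA[j], ud)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        rot.append(row)
    return rot


def lifted_path(t, axis):
    """Lift to SU(2) of the closed path t -> rotation by 2*pi*t around axis."""
    return su2(2 * math.pi * t, axis)


def stabilizer_character(m, alpha):
    """Character u_z -> z^m with z = exp(i alpha / 2)."""
    return cmath.exp(1j * alpha / 2) ** m


end = lifted_path(1.0, (0.0, 0.0, 1.0))      # numerically equal to -1_2
print(end)
print(rotation(end))                          # numerically the identity of SO(3)
print(stabilizer_character(1, 2 * math.pi))   # close to -1 (m odd)
print(stabilizer_character(2, 2 * math.pi))   # close to +1 (m even)
```

The endpoint of the lifted path is $-1_2$ although its image in SO(3) is the identity, and the character takes the value $(-1)^m$ there, which is the content of (7.165).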




• $G = L \ltimes \mathbb{R}^4$, the Poincaré group, where the Lorentz group $L = O(3,1)$ consists
of all real 4 x 4 matrices that leave the indefinite quadratic form


$x^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2$ (7.166)


invariant; in this context the standard coordinates on $\mathbb{R}^4$ are labeled as $(x_0,x_1,x_2,x_3)$.
The Lorentz group has four connected components, which may be identified by
the (independent) conditions $\det(\Lambda) = \pm 1$ and $\pm\Lambda_{00} \geq 1$. For simplicity we
restrict ourselves to the connected component $L^\uparrow_+$ of the identity, in which $\det(\Lambda) = 1$
and $\Lambda_{00} \geq 1$. This group is called the proper orthochronous Lorentz group,
which in turn defines the proper orthochronous Poincaré group $P^\uparrow_+ = L^\uparrow_+ \ltimes \mathbb{R}^4$.
Writing $p^2 = p_0^2 - p_1^2 - p_2^2 - p_3^2$, the $L^\uparrow_+$-orbits in $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ are seen to be:


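1. $\mathcal{O}^0_0 = \{(0,0,0,0)\}$, with stabilizer $(L^\uparrow_+)_0 = L^\uparrow_+$;

2. $\mathcal{O}^\pm_m = \{p \in \mathbb{R}^4 \mid p^2 = m^2, \pm p_0 > 0\}$, $m > 0$, with $(L^\uparrow_+)_0 = SO(3)$;

3. $\mathcal{O}^\pm_0 = \{p \in \mathbb{R}^4 \mid p^2 = 0, \pm p_0 > 0\}$, with $(L^\uparrow_+)_0 = E(2)$;

4. $\mathcal{O}_{im} = \{p \in \mathbb{R}^4 \mid p^2 = -m^2\}$, $m > 0$, with $(L^\uparrow_+)_0 = SO(2,1)$.


Here the stabilizers $(L^\uparrow_+)_0$ are found by taking the reference points $(\pm m,0,0,0)$ in
case 2, $(\pm 1,0,0,-1)$ in case 3, and $(0,0,0,m)$ in case 4. The physically relevant
cases are probably $\mathcal{O}^+_m$ and $\mathcal{O}^+_0$. We pass straight to the universal covering group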


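$\tilde{P}^\uparrow_+ = SL(2,\mathbb{C}) \ltimes \mathbb{R}^4$, (7.167)


where the covering projection $\tilde{\pi} : SL(2,\mathbb{C}) \to L^\uparrow_+$ is given analogously to the case
(5.46). We again start from the four matrices $(\sigma_0, \sigma_1, \sigma_2, \sigma_3)$ in (5.42), and note:


- These form a basis for the (real) vector space of all self-adjoint 2 x 2 matrices;
- For any $x \in \mathbb{R}^4$ we have $\det(\sum_{\mu=0}^3 x_\mu \sigma_\mu) = x^2$ as defined in (7.166);
- For any $A \in SL(2,\mathbb{C})$ and $a \in M_2(\mathbb{C})$ we have $\det(A a A^*) = \det(a)$;
- For any $A \in SL(2,\mathbb{C})$ and self-adjoint $a \in M_2(\mathbb{C})$, $A a A^*$ is again self-adjoint.


Taking $a = \sum_\mu x_\mu \sigma_\mu$, it follows that for $A \in SL(2,\mathbb{C})$ and $x \in \mathbb{R}^4$ there must be
$\Lambda \in O(3,1)$ such that $A (\sum_\mu x_\mu \sigma_\mu) A^* = \sum_\mu (\Lambda \cdot x)_\mu \sigma_\mu$. By continuity and the fact
that $SL(2,\mathbb{C})$ is connected it follows that in fact $\Lambda \in L^\uparrow_+$, so we put $\tilde{\pi}(A) = \Lambda$. As
for (5.46), the kernel is $\ker(\tilde{\pi}) = \mathbb{Z}_2 = \{\pm 1_2\}$. This enlarges the stabilizers: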


1. For $\mathcal{O}^\pm_m$ we now obtain $(\tilde{L}^\uparrow_+)_0 = SU(2)$, leading to a family of unitary irreducible
representations $u^{(m,j)}$ labeled by mass $m > 0$ and spin $j = 0, \tfrac{1}{2}, 1, \ldots$;

2. For $\mathcal{O}^\pm_0$ the stabilizer $(\tilde{L}^\uparrow_+)_0$ of $(1,0,0,1)$ is a double cover $E(2)'$ of E(2),
whose unitary irreducible representations are labeled by either $(0,n)$ with $n \in \mathbb{Z}/2$
(called helicity) or by $r > 0$. The latter case does not occur in nature.


On the one hand, this classification is a triumph of mathematical physics, but on
the other hand, it fails to single out which cases actually occur in nature: as far
as we know, these are spin $j = 0$ and $j = \tfrac{1}{2}$ and helicity $n = \pm 1$ and $n = \pm 2$.
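
The construction of the covering projection from SL(2,C) onto the proper orthochronous Lorentz group lends itself to a quick numerical check. The sketch below (plain Python, with hypothetical helper names) assumes that $\sigma_0$ is the 2 x 2 unit matrix and that $(\sigma_1,\sigma_2,\sigma_3)$ are the usual Pauli matrices; it recovers the Lorentz matrix from the relation $A (\sum_\nu x_\nu \sigma_\nu) A^* = \sum_\mu (\Lambda x)_\mu \sigma_\mu$ via $\Lambda_{\mu\nu} = \tfrac{1}{2}\mathrm{Tr}(\sigma_\mu A \sigma_\nu A^*)$, which uses $\mathrm{Tr}(\sigma_\mu \sigma_\nu) = 2\delta_{\mu\nu}$. The sample matrix A is an arbitrary choice with determinant one.

```python
# sigma_0 (assumed to be the unit matrix) and the three Pauli matrices.
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def hermitian(x):
    """The self-adjoint matrix sum_mu x_mu sigma_mu attached to x in R^4."""
    return [[sum(x[mu] * SIGMA[mu][i][j] for mu in range(4)) for j in range(2)]
            for i in range(2)]


def minkowski(x):
    """The quadratic form (7.166)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def lorentz(A):
    """Lambda_{mu nu} = (1/2) Tr(sigma_mu A sigma_nu A*)."""
    Ad = dagger(A)
    lam = []
    for mu in range(4):
        row = []
        for nu in range(4):
            m = matmul(SIGMA[mu], matmul(A, matmul(SIGMA[nu], Ad)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        lam.append(row)
    return lam


def apply(lam, x):
    """Matrix-vector product Lambda . x."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]


A = [[2, 1j], [0, 0.5]]          # det(A) = 1, not unitary
x = [1.5, 0.2, -0.7, 0.4]
y = apply(lorentz(A), x)
print(det2(hermitian(x)), minkowski(x))   # determinant equals x^2
print(minkowski(y), minkowski(x))         # Lambda preserves x^2
print(lorentz(A)[0][0])                   # at least 1 (orthochronous)
```

Since A enters quadratically, A and −A give the same Lorentz matrix, in line with the kernel $\{\pm 1_2\}$ of the covering projection.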
• $G = E(3) \ltimes \mathbb{R}^4$, the Galilei group, defined via the following E(3)-action on $\mathbb{R}^4$:
$(R, v) : (a_0, a) \mapsto (a_0, Ra + a_0 v)$. (7.168)


Note that $v$ is physically interpreted as a velocity, whereas earlier $a \in \mathbb{R}^3 \subset E(3)$
was a position variable. This is clear from the defining G-action on $\mathbb{R}^4$, given by


$(R, v, a_0, a) : (t, x) \mapsto (t + a_0, Rx + a + tv)$, (7.169)
which in fact determines the action (7.168). Either way, we obtain the group law
$(R, v, a_0, a) \cdot (R', v', a_0', a') = (RR', v + Rv', a_0 + a_0', a + Ra' + a_0' v)$. (7.170)


We therefore see that the role of the Lorentz group SO(3,1) is now played by the
Euclidean group E(3). Since from (7.170) the inverse is found to be


$(R, v, a_0, a)^{-1} = (R^{-1}, -R^{-1}v, -a_0, -R^{-1}(a - a_0 v))$, (7.171)


the dual E(3)-action on $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ is given (in non-relativistic notation) by
$(R, v) : (E, p) \mapsto (E - \langle v, Rp \rangle, Rp)$. (7.172)
Hence the dual E(3)-orbits in $\mathbb{R}^4$ are labeled by $E \in \mathbb{R}$ and $r > 0$, as follows:


$\mathcal{O}_E = \{(E, 0)\}$; (7.173)
$\mathcal{O}_r = \{(E, p) \mid E \in \mathbb{R}, \|p\| = r\}$. (7.174)


The representations of G corresponding to the first type are basically the
representations of E(3), whereas in the second case the stability group of, say, $(0,0,0,r)$
is isomorphic to E(2). None of the ensuing induced representations of G
reproduces any recognizable version of non-relativistic quantum mechanics, for
which we need to pass to projective representations of G. These may be found
from Theorem 5.62, which here applies in full glory, since $H^2(\mathfrak{g},\mathbb{R}) \neq 0$. A
(lengthy) computation shows that $H^2(\mathfrak{g},\mathbb{R})$ has a single generator


$\varphi((M,v,a_0,a),(M',v',a_0',a')) = \langle v,a' \rangle - \langle v',a \rangle$, (7.175)


where $M \in \mathfrak{so}(3)$, and $(v,a_0,a) \in \mathbb{R}^3 \times \mathbb{R}^4 \subset \mathfrak{g} = \mathfrak{so}(3) \oplus \mathbb{R}^3 \oplus \mathbb{R}^4$ are identified
with the corresponding Lie group elements. Following the procedure culminating
in Theorem 5.62, the central extension $\check{G}$ is found to be (cf. (7.159) and (5.46))


$\check{G} = \widetilde{E(3)} \ltimes \mathbb{R}^5$, (7.176)


where, writing $\tilde{\pi}(u) = R(u)$, the covering group $\widetilde{E(3)}$ acts on $\mathbb{R}^5$ through


$(u,v) : (a_0,a,c) \mapsto (a_0, R(u)a + a_0 v, c + \tfrac{1}{2} a_0 \|v\|^2 + \langle v, R(u)a \rangle)$. (7.177)
Consequently, writing $\tilde{x} = (u,v,a_0,a)$, for the group law in $\check{G}$ we obtain


$(\tilde{x},c) \cdot (\tilde{x}',c') = (\tilde{x} \cdot \tilde{x}', c + c' + \langle v, R(u)a' \rangle + \tfrac{1}{2} a_0' \|v\|^2)$. (7.178)


Eq. (7.177) implies the following dual $\widetilde{E(3)}$-action on $(\mathbb{R}^5)^* \cong \mathbb{R}^5$:


$(u,v) : (E,p,m) \mapsto (E - \langle v, R(u)p \rangle + \tfrac{1}{2} m \|v\|^2, R(u)p - mv, m)$. (7.179)
This time, the $\widetilde{E(3)}$-orbits in $\mathbb{R}^5$ are:


1. $\mathcal{O}_E = \{(E,0,0)\}$ ($E \in \mathbb{R}$), with stabilizer $\widetilde{E(3)}$;
2. $\mathcal{O}_r = \{(E,p,0) \mid E \in \mathbb{R}, \|p\| = r\}$ ($r > 0$), with stabilizer $E(2)'$;
3. $\mathcal{O}_{U,m} = \{(E,p,m) \mid E - E_p = U\}$ ($m \in \mathbb{R} \setminus \{0\}$, $U \in \mathbb{R}$), with stabilizer SU(2).


Here $E(2)' \subset \widetilde{E(3)}$ is a double cover of E(2), like the subgroup of $SL(2,\mathbb{C})$
stabilizing the point $(1,0,0,1) \in \mathbb{R}^4$ in the theory of the Poincaré group. This
time we take any point $(E,0,0,r,0) \in \mathbb{R}^5$, which is stabilized by pairs $(u,v) \in \widetilde{E(3)}$
for which $R(u)$ is a rotation around the z-axis and $v = (v_1,v_2,0)$; the image
of these pairs in E(3) is $E(2) = SO(2) \ltimes \mathbb{R}^2$, where $SO(2) \subset SO(3)$ consists of
rotations around the z-axis and $\mathbb{R}^2$ is the x-y plane. In the third case we write
$E_p = \|p\|^2/2m$ and take $(U,0,m)$, whose stabilizer in E(3) is evidently SO(3).
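
The formulae for the Galilei group are easy to get wrong by a sign or a prime, so a numerical sanity check may be helpful. The following plain-Python sketch, with hypothetical helper names and arbitrarily chosen sample elements, encodes the group law (7.170), the inverse (7.171), the space-time action (7.169), and the dual action (7.179), where a rotation matrix R stands in for R(u). It then checks that the action is compatible with the group law, that the inverse is two-sided, and that the internal energy $U = E - \|p\|^2/2m$ is constant along the orbits of the third kind.

```python
import math


def mat_vec(R, x):
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]


def mat_mul(R, S):
    return [[sum(R[i][k] * S[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]


def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]


def scale(c, x):
    return [c * xi for xi in x]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def mul(g, h):
    """Group law (7.170) for elements (R, v, a0, a)."""
    R, v, a0, a = g
    R2, v2, b0, b = h
    return (mat_mul(R, R2), add(v, mat_vec(R, v2)), a0 + b0,
            add(add(a, mat_vec(R, b)), scale(b0, v)))


def inv(g):
    """Inverse (7.171); the transpose is the inverse of a rotation."""
    R, v, a0, a = g
    Rt = transpose(R)
    return (Rt, scale(-1.0, mat_vec(Rt, v)), -a0,
            scale(-1.0, mat_vec(Rt, add(a, scale(-a0, v)))))


def act(g, t, x):
    """Action (7.169) on a space-time point (t, x)."""
    R, v, a0, a = g
    return (t + a0, add(add(mat_vec(R, x), a), scale(t, v)))


def dual_act(R, v, E, p, m):
    """Dual action (7.179) on (E, p, m), with R playing the role of R(u)."""
    Rp = mat_vec(R, p)
    return (E - dot(v, Rp) + 0.5 * m * dot(v, v), add(Rp, scale(-m, v)), m)


def flatten(g):
    R, v, a0, a = g
    return [e for row in R for e in row] + list(v) + [a0] + list(a)


def close(xs, ys, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(xs, ys))


IDENTITY = (rot_z(0.0), [0.0, 0.0, 0.0], 0.0, [0.0, 0.0, 0.0])
g = (rot_z(0.7), [0.3, -1.2, 0.5], 2.0, [1.0, 0.0, -4.0])
h = (rot_x(-1.1), [2.0, 0.1, 0.0], -0.5, [0.0, 3.0, 1.0])

print(close(flatten(mul(g, inv(g))), flatten(IDENTITY)))
print(act(mul(g, h), 1.5, [0.2, -0.4, 0.9]))
print(act(g, *act(h, 1.5, [0.2, -0.4, 0.9])))
```

The last two printed lines agree up to rounding, which is the statement that (7.169) is an action for the group law (7.170) with the term $a_0' v$ in the last slot.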


Thus we have massless as well as massive particles both in relativistic and in
non-relativistic quantum physics. The simplest case of all is formed by massive
non-relativistic particles, which correspond to the orbits $\mathcal{O}_{U,m}$ above, supplemented with
a spin $j$ labelling the underlying irreducible representation $D_j$ of SU(2). Such orbits
are diffeomorphic to $\mathbb{R}^3$ under the identification $(U + E_p, p, m) \leftrightarrow p$, and a convenient
choice of the cross-section $s : \mathcal{O}_{U,m} \to \widetilde{E(3)}$ is $s(p) = (1_2, -p/m)$, since in
that case the Wigner cocycle simply becomes $s(p)^{-1}(u,v)s((u,v)^{-1}p) = u$. Since
different values of $U$ turn out to give equivalent representations of $\check{G}$ (in the sense
explained at the end of §5.10), we take $U = 0$, and eqs. (7.135) - (7.136) become


A”) = L?(R3) @H;; (7.180) 
a” (u,v,ao,a) (p) = eX @P) Di (u) P(R(u)'(p+mv)). (7.181) 
Here $L^2(\mathbb{R}^3)$ simply carries Lebesgue measure $d^3p$, which is $\widetilde{E(3)}$-invariant.

The massive relativistic case is slightly more involved: we again have $\mathcal{O}^+_m \cong \mathbb{R}^3$
under $(\omega_p, p) \leftrightarrow p$, where $\omega_p = \sqrt{\|p\|^2 + m^2}$, but the Lorentz-invariant measure on
$\mathcal{O}^+_m$ is $d^3p/\omega_p$. For each $p \in \mathbb{R}^3$ there is a unique boost $b_p \in L^\uparrow_+$ that maps $(m,0,0,0)$
to $(\omega_p, p)$, with pre-image $\tilde{b}_p$ in $SL(2,\mathbb{C})$, so we take $s(p) = \tilde{b}_p$. The Hilbert space
is (mutatis mutandis) still given by (7.180), but instead of (7.181) we now obtain


i”/(2),a)V(p) = el \*P)) D (b> Ab, 1) W(A |p), (7.182) 


where $a = (a_0, \mathbf{a})$, $A \in SL(2,\mathbb{C})$, and $\Lambda \in L^\uparrow_+$ is the image of $A$ under the covering
projection. We leave the corresponding formulae for the massless case to the reader.
7.7 Quantization and permutation symmetry


Another interesting application of the quantization theory developed in this chapter 
is to indistinguishable particles. Since all elementary particles come in families
of indistinguishable sorts (such as electrons, photons, ...), this topic is obviously
of fundamental importance to physics. It is also puzzling, since (as we shall see) 
mathematically one expects more possibilities than those realized in Nature (namely 
bosons and fermions). This topic is also interesting philosophically, because it ap- 
pears to be a testing ground for Leibniz’s Principle of the Identity of Indiscernibles 
(PII), which states that two different objects cannot have exactly the same properties 
(in other words, two objects that have exactly the same properties must be identical). 

After a period of confusion but growing insight, involving some of the greatest 
physicists such as Planck, Einstein, Ehrenfest, Fermi, and especially Heisenberg, 
the modern point of view on quantum statistics was introduced by Dirac. 

Using modern notation, and abstracting from his specific example (which
involved electronic wave-functions), Dirac's argument is as follows. Let H be the
Hilbert space of a single quantum system, called a particle in what follows. The
two-fold tensor product $H^2 \equiv H \otimes H$ then describes two distinguishable copies of
this particle. The permutation group $\mathfrak{S}_2$ on two objects, with nontrivial element
(12), acts on the state space $H^2$ by linear extension of $u(12)\psi_1 \otimes \psi_2 = \psi_2 \otimes \psi_1$.
Praising Heisenberg's emphasis on defining everything in terms of observable
quantities only, Dirac then declares the two particles to be indistinguishable if
$u(12)au(12)^* = a$ for any two-particle observable $a$; by unitarity, this is to say that
$a$ commutes with $u(12)$. Dirac notes that such operators map symmetrized vectors
(i.e. those $\psi \in H \otimes H$ for which $u(12)\psi = \psi$) into symmetrized vectors, and
likewise map anti-symmetrized vectors (i.e. those $\psi \in H \otimes H$ for which $u(12)\psi = -\psi$)
into anti-symmetrized vectors, and these are the only possibilities; we would now
say that under the action of the $\mathfrak{S}_2$-invariant (bounded) operators one has


$H^2 = H^2_S \oplus H^2_A$; (7.183)
$H^2_S = \{\psi \in H^2 \mid u(12)\psi = \psi\}$; (7.184)
$H^2_A = \{\psi \in H^2 \mid u(12)\psi = -\psi\}$. (7.185)


Arguing that in order to avoid double counting (in that $\psi$ and $u(12)\psi$ should not
both occur as independent states) one has to pick one of these two possibilities, Dirac
concludes that state vectors of a system of two indistinguishable particles must be
either symmetric or anti-symmetric. He then generalizes this to N identical particles:
if $(ij)$ is the element of the permutation group $\mathfrak{S}_N$ on N objects that permutes $i$ and
$j$ ($i,j = 1,\ldots,N$), then according to Dirac, $\psi \in H^N \equiv H^{\otimes N}$ should satisfy either
$u(ij)\psi = \psi$, in which case $\psi \in H^N_S$, or $u(ij)\psi = -\psi$, in which case $\psi \in H^N_A$, where
$u$ is the natural unitary representation of $\mathfrak{S}_N$ on $H^N$, given, on $p \in \mathfrak{S}_N$, by linear
(and if necessary continuous) extension of


$u(p)\psi_1 \otimes \cdots \otimes \psi_N = \psi_{p(1)} \otimes \cdots \otimes \psi_{p(N)}$. (7.186)
Equivalently, $\psi \in H^N_S$ if it is invariant under all permutations, and $\psi \in H^N_A$ if it is
invariant under even permutations and picks up a minus sign under odd permutations.
A slightly more sophisticated version of this argument often runs as follows:


‘Since, in the case of indistinguishable particles, $\psi \in H^N$ and $u(p)\psi$ must represent the
same state for any $p \in \mathfrak{S}_N$, and since two unit vectors represent the same state iff they
differ by a phase factor, by unitarity it must be that $u(p)\psi = c(p)\psi$, for some $c(p) \in \mathbb{C}$
satisfying $|c(p)| = 1$. The group property $u(pp') = u(p)u(p')$ then implies that $c(p) = 1$ for
even permutations and $c(p) = \pm 1$ for odd permutations. The choice +1 in the latter leads
to bosons, whereas −1 leads to fermions, so these are the only possibilities.’


Alas, where Dirac's argument is incomplete, this one is even inconsistent: the claim
that two unit vectors represent the same state iff they differ by a phase factor
presumes that the particles are distinguishable! Indeed, the only physical argument to
the effect that two unit vectors $\psi$ and $\psi'$ are equivalent iff $\psi' = z\psi$ with $|z| = 1$, is
that it guarantees that expectation values coincide, i.e., that


$\langle \psi, a\psi \rangle = \langle \psi', a\psi' \rangle$, (7.187)


for all (bounded) operators $a$, i.e., not merely for the permutation-invariant operators
(if (7.187) is required only for the latter, $\psi' = z\psi$ does not follow). But, following
Heisenberg and Dirac, the whole point of having indistinguishable particles is that an
operator $a$ represents a physical observable iff it is invariant under all permutations
(acting by conjugation)!

Although the above arguments therefore seem feeble at best, their conclusion that
only bosons and fermions can exist seems validated by Nature, despite the
mathematical fact that the orthogonal complement of $H^N_S \oplus H^N_A$ in $H^N$ (describing particles
with parastatistics) is non-zero as soon as $N > 2$. This should be a source of
concern, and indeed, much research on indistinguishable particles (in $d > 2$) has had
the goal of explaining away parastatistics. Distinguished by the different actions of
$\mathfrak{S}_N$ they depart from, these explanations have traditionally been based on:


• Quantum observables. $\mathfrak{S}_N$ acts on the C*-algebra $B(H^N)$ of bounded operators
on $H^N$ by conjugation of the unitary representation $u(\mathfrak{S}_N)$ on $H^N$, cf. (7.186).
One implements permutation invariance by postulating that the physical
observables of the N-particle system under consideration be the $\mathfrak{S}_N$-invariant operators:
with $u$ given by (7.186), the algebra of observables is therefore taken to be


$M_N = B(H^N)^{\mathfrak{S}_N} = \{a \in B(H^N) \mid [a,u(p)] = 0\ (p \in \mathfrak{S}_N)\}$. (7.188)
• Quantum states. By restriction, $\mathfrak{S}_N$ then also acts on the (normal) state space
$S_n(H^N) \cong \mathcal{D}(H^N) \subset B_1(H^N)$, (7.189)


from which it is postulated that the physical state space is $\mathcal{D}(H^N)^{\mathfrak{S}_N}$.

• Classical states. $\mathfrak{S}_N$ acts on $M^N$, the N-fold cartesian product of the classical
one-particle phase space $M$, by permutation. If $M = T^*Q$ for some configuration
space $Q$, we might as well start from the natural action of $\mathfrak{S}_N$ on $Q^N$ (pulled back
to $M^N$), and this is indeed what we shall do, often further simplifying to $Q = \mathbb{R}^d$.
Unsurprisingly, the first two approaches are equivalent. Define a linear map


$E_N : B(H^N) \to B(H^N)^{\mathfrak{S}_N}$; (7.190)
$a \mapsto \frac{1}{N!} \sum_{p \in \mathfrak{S}_N} u(p) a u(p)^*$; (7.191)


this is a (normal) conditional expectation from the von Neumann algebra $B(H^N)$
to the von Neumann algebra $B(H^N)^{\mathfrak{S}_N}$, i.e., $E_N(a^*) = E_N(a)^*$ for all $a \in B(H^N)$,
$E_N^2 = E_N$, and $\|E_N\| = 1$. Moreover, $E_N$ preserves positivity as well as the trace, so
that it also maps the state space $\mathcal{D}(H^N)$ onto the invariant states $\mathcal{D}(H^N)^{\mathfrak{S}_N} \subset B_1(H^N)$.
Simple computations also establish the properties


$\mathrm{Tr}(\rho a) = \mathrm{Tr}(E_N(\rho) a)$ ($\rho \in \mathcal{D}(H^N)$, $a \in B(H^N)^{\mathfrak{S}_N}$); (7.192)
$\mathrm{Tr}(\rho a) = \mathrm{Tr}(\rho E_N(a))$ ($\rho \in \mathcal{D}(H^N)^{\mathfrak{S}_N}$, $a \in B(H^N)$). (7.193)


Finally, the reduction of $H^N$ under $u(\mathfrak{S}_N)$ described below may equally well be
described in terms of the state space, since a subspace $eH^N \subset H^N$ (where $e \in \mathcal{P}(H^N)$ is
a projection) is stable under $u$ iff $e \in \mathcal{P}(H^N)^{\mathfrak{S}_N}$, in which case it may be described
in terms of the associated density operator $\rho = e/\mathrm{Tr}(e) \in \mathcal{D}(H^N)^{\mathfrak{S}_N}$. With some
more effort, it can even be shown that $\rho \in \partial_e(\mathcal{D}(H^N)^{\mathfrak{S}_N})$ iff $eH^N$ is irreducible.
We may therefore focus on the first and the third approaches, starting with the
first, based on (7.188). Note that the C*-algebra of invariant compact operators, i.e.,


$A_N = B_0(H^N)^{\mathfrak{S}_N} = \{a \in B_0(H^N) \mid [a,u(p)] = 0\ (p \in \mathfrak{S}_N)\}$, (7.194)


induces the same decomposition of $H^N$ as $M_N$ does (since $M_N = A_N''$), so if $H$ is
infinite-dimensional one may use $A_N$ rather than $M_N$ as the algebra of quantum
observables; this is convenient for comparison with the classical state space approach.
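
In finite dimension the stated properties of the conditional expectation (7.190) - (7.191) can be verified by brute force. The sketch below is only an illustration under stated assumptions: it takes $H = \mathbb{C}^d$ with its standard basis, realizes each $u(p)$ as a permutation of the product basis of $H^N$ (one of the two possible conventions for (7.186), which is immaterial after averaging over the whole group), and uses exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def basis_maps(d, n):
    """For each permutation p, the induced bijection of the product basis."""
    basis = list(product(range(d), repeat=n))
    index = {b: k for k, b in enumerate(basis)}
    maps = []
    for p in permutations(range(n)):
        maps.append([index[tuple(b[p[i]] for i in range(n))] for b in basis])
    return maps


def conjugate(a, m):
    """u a u* for the permutation matrix u sending basis vector k to m[k]."""
    size = len(a)
    out = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            out[m[r]][m[c]] = a[r][c]
    return out


def expectation(a, d, n):
    """E_N(a) = (1/N!) sum_p u(p) a u(p)*, cf. (7.191)."""
    size = d ** n
    total = [[Fraction(0)] * size for _ in range(size)]
    for m in basis_maps(d, n):
        b = conjugate(a, m)
        for r in range(size):
            for c in range(size):
                total[r][c] += b[r][c]
    return [[x / factorial(n) for x in row] for row in total]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


d, n = 2, 3
size = d ** n
a = [[r * size + c for c in range(size)] for r in range(size)]  # arbitrary test matrix
e = expectation(a, d, n)

print(expectation(e, d, n) == e)                               # idempotent
print(all(conjugate(e, m) == e for m in basis_maps(d, n)))     # invariant
print(trace(e) == trace(a))                                    # trace-preserving
```

Invariance under conjugation by every $u(p)$ is the same as commuting with every $u(p)$, so the second check confirms that $E_N(a)$ lies in $M_N$ as defined in (7.188).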

As long as dim(H) > 1 and N > 1, the algebras $M_N$ and $A_N$ act reducibly on
$H^N$. The reduction of $H^N$ under $M_N$ (and hence under $A_N$ and $u(\mathfrak{S}_N)''$) is traditionally
carried out by Schur duality. This rests on the following concepts.


Definition 7.11. • A partition $\lambda$ of N is a way of writing
$N = n_1 + \cdots + n_k$, $n_1 \geq \cdots \geq n_k > 0$. (7.195)


• The corresponding frame (or Young diagram) $F_\lambda$ is a picture of N boxes with
$n_i$ boxes in the i'th row, $i = 1,\ldots,k$.

• For each frame $F_\lambda$, one has N! possible Young tableaux T, each of which is a
particular way of writing all of the numbers 1 to N into the boxes of $F_\lambda$.

• A Young tableau is standard if the entries in each row increase from left to
right and the entries in each column increase from top to bottom. The set of
all (standard) Young tableaux on $F_\lambda$ is called $\mathcal{T}_\lambda$ ($\mathcal{T}^S_\lambda$).

• To each $T \in \mathcal{T}_\lambda$ we associate the subgroup $\mathrm{Row}(T) \subset \mathfrak{S}_N$ of all permutations
$p \in \mathfrak{S}_N$ that preserve each row (i.e., each row of T is permuted within itself);
likewise $\mathrm{Col}(T) \subset \mathfrak{S}_N$ consists of all $p \in \mathfrak{S}_N$ that preserve each column.
The set Par(N) of all partitions $\lambda$ of N parametrizes the conjugacy classes of $\mathfrak{S}_N$
and hence also the (unitary) dual of $\mathfrak{S}_N$; in other words, up to (unitary) equivalence
each (unitary) irreducible representation $u_\lambda$ of $\mathfrak{S}_N$ bijectively corresponds to some
partition $\lambda$ of N; the dimension of any vector space $V_\lambda$ carrying $u_\lambda$ is $N_\lambda = |\mathcal{T}^S_\lambda|$,
that is, the number of different standard Young tableaux on the frame $F_\lambda$.
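
To get a feeling for these numbers, one may enumerate the partitions of small N and count the standard Young tableaux on each frame. The sketch below does so with the hook length formula, a standard result that is not derived in the text and is used here as an assumption: the number of standard tableaux equals N! divided by the product of the hook lengths of all boxes. As a consistency check, the squares of these dimensions must add up to N!, the order of the permutation group. The function names are hypothetical.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples (n_1, ..., n_k)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def num_standard_tableaux(shape):
    """Number of standard Young tableaux on the frame `shape` (hook length formula)."""
    n = sum(shape)
    # cols[j] = number of boxes in column j of the frame
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])] if shape else []
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1        # boxes to the right in the same row
            leg = cols[j] - i - 1    # boxes below in the same column
            hooks *= arm + leg + 1
    return factorial(n) // hooks


for n in (2, 3, 4):
    dims = {lam: num_standard_tableaux(lam) for lam in partitions(n)}
    print(n, dims, sum(v * v for v in dims.values()) == factorial(n))
```

For N = 3 this lists the frames (3), (2,1) and (1,1,1) with 1, 2 and 1 standard tableaux, respectively; the two one-dimensional cases are the bosonic and fermionic ones, and the frame (2,1) is the first instance with dimension greater than one.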

Returning to (7.186), to each $\lambda \in \mathrm{Par}(N)$ and each Young tableau $T \in \mathcal{T}_\lambda$ we
associate an operator $e_T$ on $H^N$ by the formula


$e_T = \frac{N_\lambda}{N!} \sum_{p \in \mathrm{Col}(T)} \mathrm{sgn}(p)\, u(p) \sum_{p' \in \mathrm{Row}(T)} u(p')$, (7.196)


which happens to be a projection. Its image $e_T H^N \subset H^N$ is denoted by $H^N_T$, and the
restriction of $M_N$ to $H^N_T$ is called $M_N(T)$. One may now write the decomposition of
$H^N$ under the action of $M_N$ (up to unitary equivalence) as


$H^N \cong \bigoplus_{\lambda \in \mathrm{Par}(N)} H^N_{T_\lambda} \otimes V_\lambda$; (7.197)
$M_N \cong \bigoplus_{\lambda \in \mathrm{Par}(N)} M_N(T_\lambda) \otimes 1_{V_\lambda}$; (7.198)
$u(\mathfrak{S}_N) \cong \bigoplus_{\lambda \in \mathrm{Par}(N)} 1_{H^N_{T_\lambda}} \otimes u_\lambda$, (7.199)


where the labeling is by the partitions $\lambda$ of N, the multiplicity spaces $V_\lambda$ are
irreducible $\mathfrak{S}_N$-modules, and $T_\lambda$ is an arbitrary choice of a Young tableau defined
on $F_\lambda$. For simplicity we here assume that $\dim(H) \geq N$; if $\dim(H) < N$, then only
partitions (7.195) with $k \leq \dim(H)$ occur. For example, the partitions (7.195) of
N = 2 are 2 = 2 and 2 = 1 + 1, each of which admits only one standard Young
tableau, which we denote by S and A, respectively. With $N_2 = N_{1+1} = 1$ and hence
$V_2 = V_{1+1} = \mathbb{C}$ as vector spaces, this recovers (7.183); the corresponding projections
$e_+$ and $e_-$, respectively, are given by $e_+ = \tfrac{1}{2}(1 + u(12))$ and $e_- = \tfrac{1}{2}(1 - u(12))$. The
bosonic states $\psi_+$, i.e., the solutions of $\psi_+ \in H^2_S$ or $e_+\psi_+ = \psi_+$, are just the
symmetric vectors, whereas the fermionic states $\psi_- \in H^2_A$ are the antisymmetric ones.
These sectors exist for all N > 1 and they always occur with multiplicity one.
However, and this is the bite of the topic, for $N \geq 3$ additional irreducible
representations of $M_N$ appear, always with multiplicity greater than one; states in such
sectors are said to describe paraparticles and/or are said to have parastatistics. For
example, for N = 3 one new partition 3 = 2 + 1 occurs, with $N_{2+1} = 2$, and hence


$H^3 = H^3_S \oplus H^3_A \oplus H^3_P \oplus H^3_{P'}$, (7.200)


where $H^3_P$ and $H^3_{P'}$ are the images of the projections $e_P = \tfrac{1}{3}(1 - u(13))(1 + u(12))$
and $e_{P'} = \tfrac{1}{3}(1 - u(12))(1 + u(13))$, respectively. The corresponding two classes of
parastates (i.e. states carrying parastatistics) $\psi_P$ and $\psi_{P'}$ then by definition satisfy
$e_P\psi_P = \psi_P$ and $e_{P'}\psi_{P'} = \psi_{P'}$, respectively. In other words, the Hilbert spaces
carrying each of the four sectors are the following closed linear spans:


$H^3_S = \mathrm{span}^-\{\psi_{123} + \psi_{213} + \psi_{321} + \psi_{312} + \psi_{132} + \psi_{231}\}$; (7.201)
$H^3_A = \mathrm{span}^-\{\psi_{123} - \psi_{213} - \psi_{321} + \psi_{312} - \psi_{132} + \psi_{231}\}$; (7.202)
$H^3_P = \mathrm{span}^-\{\psi_{123} + \psi_{213} - \psi_{321} - \psi_{312}\}$; (7.203)
$H^3_{P'} = \mathrm{span}^-\{\psi_{123} + \psi_{321} - \psi_{213} - \psi_{231}\}$, (7.204)


where $\psi_{ijk} \equiv \psi_i \otimes \psi_j \otimes \psi_k$, and the $\psi_i$ vary over H (and $\mathrm{span}^-$ is closed linear span).
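
The case N = 3 is small enough to be checked by hand or by machine. The following sketch works in the group algebra of the permutation group on three objects with exact rational coefficients, under two assumptions that are not spelled out in the text: the operators $e_P$ and $e_{P'}$ are taken with the normalization factor 1/3 required for idempotency, and the dimension of each sector for $H = \mathbb{C}^d$ is computed as the trace of the corresponding idempotent, using the standard fact that the trace of $u(p)$ on $H^{\otimes 3}$ equals $d$ raised to the number of cycles of $p$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

IDENT = (0, 1, 2)
T12 = (1, 0, 2)   # transposition of the first and second object
T13 = (2, 1, 0)   # transposition of the first and third object


def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def multiply(x, y):
    """Product in the group algebra; elements are dicts permutation -> coefficient."""
    out = {}
    for p, cp in x.items():
        for q, cq in y.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + cp * cq
    return {p: c for p, c in out.items() if c != 0}


def add(x, y):
    out = dict(x)
    for p, c in y.items():
        out[p] = out.get(p, 0) + c
    return {p: c for p, c in out.items() if c != 0}


def scale(c, x):
    return {p: c * v for p, v in x.items()}


def sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def cycles(p):
    seen, count = set(), 0
    for i in range(len(p)):
        if i not in seen:
            count += 1
            while i not in seen:
                seen.add(i)
                i = p[i]
    return count


def dimension(x, d):
    """Trace of the operator on (C^d)^{tensor 3} represented by x."""
    return sum(c * d ** cycles(p) for p, c in x.items())


ALL = list(permutations(range(3)))
third, sixth = Fraction(1, 3), Fraction(1, 6)
e_S = {p: sixth for p in ALL}
e_A = {p: sixth * sign(p) for p in ALL}
e_P = scale(third, multiply({IDENT: 1, T13: -1}, {IDENT: 1, T12: 1}))
e_Pp = scale(third, multiply({IDENT: 1, T12: -1}, {IDENT: 1, T13: 1}))

for name, e in (("S", e_S), ("A", e_A), ("P", e_P), ("P'", e_Pp)):
    print(name, multiply(e, e) == e, dimension(e, 3))
print(add(add(e_S, e_A), add(e_P, e_Pp)) == {IDENT: 1})
```

For d = 3 the four sectors have dimensions 10, 1, 8 and 8, which add up to 27, and the four idempotents sum to the unit, in agreement with the decomposition (7.200).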
For any N > 2, let us note that instead of the decomposition (7.197) - (7.198),
which is defined up to unitary equivalence, one may alternatively decompose $H^N$ as


$H^N = \bigoplus_{T \in \mathcal{T}^S_\lambda,\, \lambda \in \mathrm{Par}(N)} H^N_T$; (7.205)
$M_N = \bigoplus_{T \in \mathcal{T}^S_\lambda,\, \lambda \in \mathrm{Par}(N)} M_N(T)$, (7.206)


which has the advantage over (7.197) - (7.198) that the $H^N_T$ are subspaces of $H^N$.
The disadvantage is that $M_N(T)$ is unitarily equivalent to $M_N(T')$ iff T and T' both
lie in $\mathcal{T}_\lambda$ (i.e., for the same $\lambda$), so that unlike (7.197) - (7.198), the
decomposition (7.205) - (7.206) is non-unique (for example, Young tableaux different from
standard ones might have been chosen in the parametrization). The analogue of the
third line (7.199) in the earlier decomposition would therefore be a mess. Indeed,
although $\mathfrak{S}_N$ maps each of the subspaces $H_+$ and $H_-$ into itself (the former is even
pointwise invariant under $\mathfrak{S}_N$, whereas elements of the latter at most pick up a minus
sign), this is no longer the case for parastatistics. For example, for N = 3 some
permutations map $H^3_P$ into $H^3_{P'}$, and vice versa. This is clear from (7.205) - (7.206): for
$\lambda = P$, one has $\dim(V_P) = 2$, and choosing a basis $(v_1, v_2)$ of $V_P$ one may identify
$H^3_P$ and $H^3_{P'}$ in (7.205) with (say) $H^3_P \otimes v_1$ and $H^3_P \otimes v_2$ in (7.197), respectively.
And analogously for N > 3, where $\dim(V_\lambda) > 1$ for all $\lambda \neq S, A$.

A (or perhaps the) competing approach to permutation invariance in quantum
mechanics starts from classical (rather than quantal) data. Let Q be the classical
single-particle configuration space, e.g., $Q = \mathbb{R}^d$; to avoid irrelevant complications,
we assume that Q is a connected and simply connected manifold. The associated
configuration space of N identical but distinguishable particles is $Q^N$. Depending
on the assumption of (im)penetrability of the particles, we may define one of


$\check{Q}_N = Q^N/\mathfrak{S}_N$; (7.207)
$Q_N = (Q^N \setminus \Delta_N)/\mathfrak{S}_N$, (7.208)


as the configuration space of N indistinguishable particles, where $\Delta_N$ is the extended
diagonal in $Q^N$, i.e., the set of points $(q_1,\ldots,q_N) \in Q^N$ where $q_i = q_j$ for at least
one pair $(i,j)$, $i \neq j$ (so that for $Q = \mathbb{R}$ and N = 2 this is the usual diagonal in $\mathbb{R}^2$).
At first sight, these two choices should lead to exactly the same quantum theory,
based on the Hilbert space $L^2(\check{Q}_N) = L^2(Q_N)$, since $\Delta_N$ is a subset of measure zero
for any measure used to define $L^2$ that is locally equivalent to Lebesgue measure.
However, the effect of $\Delta_N$ is noticeable as soon as one represents physical
observables as operators on $L^2$ through any serious quantization procedure, which should
be sensitive to both the topological and the smooth structure of the underlying
configuration space. In the case at hand, $Q_N$ is multiply connected as a topological
space, but as a manifold it is smooth and has no singularities. In contrast, $\check{Q}_N$ is
simply connected as a topological space, but in the smooth setting it is a so-called
orbifold. This leads to interesting complications, but following tradition (i.e., in the
configuration space approach to indistinguishable particles) we continue with $Q_N$.

To quantize $Q_N$ we use the language of Lie groupoids and their C*-algebras, cf.
§§C.16-C.17. Let Q be any (possibly) multiply connected manifold, with universal
covering space $\tilde{Q}$. In particular, the first homotopy group $\pi_1(Q)$ acts (say from the
right) on $\tilde{Q}$ in such a way that $\tilde{Q}/\pi_1(Q) \cong Q$. We denote the canonical projection by
$\pi : \tilde{Q} \to Q$. One may have the example $Q = \mathbb{T}$, $\tilde{Q} = \mathbb{R}$, $\pi_1(Q) = \mathbb{Z}$ in mind here.

As a variation on the pair groupoid $G = Q \times Q$, we now consider the Lie groupoid


$G_Q = \tilde{Q} \times_{\pi_1(Q)} \tilde{Q}$, (7.209)


whose elements are equivalence classes $[\tilde{q}_1,\tilde{q}_2]$ in $\tilde{Q} \times \tilde{Q}$ under the equivalence
relation ~ defined by $(\tilde{q}_1,\tilde{q}_2) \sim (\tilde{q}_1',\tilde{q}_2')$ iff $\tilde{q}_1' = \tilde{q}_1 x$ and $\tilde{q}_2' = \tilde{q}_2 x$ for some $x \in \pi_1(Q)$;
the source and target projections are $s([\tilde{q}_1,\tilde{q}_2]) = \pi(\tilde{q}_2)$ and $t([\tilde{q}_1,\tilde{q}_2]) = \pi(\tilde{q}_1)$,
respectively, the inverse is $[\tilde{q}_1,\tilde{q}_2]^{-1} = [\tilde{q}_2,\tilde{q}_1]$, and multiplication is the obvious one
borrowed from the pair groupoid $\tilde{Q} \times \tilde{Q}$ over $\tilde{Q}$ (which is well defined on $G_Q$). The
tangent groupoid $G_Q^T$ of $G_Q$ (cf. Proposition C.117) has the following fiber at $\hbar = 0$:


$(G_Q^T)_0 = TQ$, (7.210)


to be contrasted with the corresponding fiber $T\tilde{Q}$ of the pair groupoid on the
covering space $\tilde{Q}$. In particular, for our configuration space $Q = Q_N$ we have


$G_{Q_N} = \tilde{Q}_N \times_{\pi_1(Q_N)} \tilde{Q}_N$; (7.211)
$(G^T_{Q_N})_0 = TQ_N$, (7.212)


which gives the fibers of the corresponding continuous bundle of C*-algebras as


$A_0 = C_0(T^*Q_N)$ ($\hbar = 0$); (7.213)
$A_\hbar = C^*(G_{Q_N})$ ($0 < \hbar \leq 1$), (7.214)


cf. §C.19. This gives a generalization of the fibers (7.17) - (7.18) for $Q = \mathbb{R}^n$, and
also now we have an example of Definition 7.1: the fibers (7.213) - (7.214)
combine to form a continuous bundle of C*-algebras with total C*-algebra $A = C^*(G^T_{Q_N})$,
yielding a deformation quantization of the Poisson manifold $T^*Q_N$ (i.e., the usual
phase space defined by the configuration space $Q_N$). We now define the
inequivalent quantizations of $Q_N$ as the inequivalent irreducible representations of the
corresponding C*-algebra of quantum observables $C^*(G_{Q_N})$, as follows.
Theorem 7.12. 1. Let Q be multiply connected. The inequivalent irreducible
representations $\pi^\lambda$ of the C*-algebra $C^*(G_Q)$ bijectively correspond to the
inequivalent irreducible unitary representations $u_\lambda$ of the first homotopy group $\pi_1(Q)$.

2. Each representation $\pi^\lambda$ has a natural realization on the Hilbert space


$H^\lambda = L^2(Q) \otimes H_\lambda$, (7.215)


where $H_\lambda$ is a specific carrier space for the representation $u_\lambda$. More fancifully,
one may use the Hilbert space $L^2(Q,E_\lambda)$ of $L^2$-sections of the vector bundle


$E_\lambda = \tilde{Q} \times_{\pi_1(Q)} H_\lambda$ (7.216)


associated to the principal bundle $\pi : \tilde{Q} \to Q$ by the representation $u_\lambda$.


Provided one accepts (7.208), this theorem in principle gives a complete solution
to the problem of quantizing multiply connected configuration spaces, and hence,
taking $Q = Q_N$, of the problem of quantizing systems of indistinguishable particles.


Proof. We just prove Theorem 7.12 in the case we need, where $\pi_1(Q)$ is finite. Then


$C^*(\tilde{Q} \times_{\pi_1(Q)} \tilde{Q}) \cong B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$; (7.217)
$B_0(L^2(\tilde{Q}))^{\pi_1(Q)} \cong B_0(L^2(Q)) \otimes C^*(\pi_1(Q))$, (7.218)


where (in our usual notation) $B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$ is the C*-algebra of $\pi_1(Q)$-invariant
compact operators on $L^2(\tilde{Q})$, and $C^*(\pi_1(Q))$ is the group C*-algebra of $\pi_1(Q)$
(which is finite-dimensional and hence nuclear, given the assumption that $\pi_1(Q)$
is finite, so that the choice of the C*-algebraic tensor product does not matter).

To prove (7.217), we first exploit finiteness of $\pi_1(Q)$ in order to identify functions
$a \in C_c^\infty(G_Q)$ with constrained $C_c^\infty$ functions $\tilde{a}$ on $\tilde{Q} \times \tilde{Q}$ that satisfy


$\tilde{a}(\tilde{q}h, \tilde{q}'h) = \tilde{a}(\tilde{q},\tilde{q}')$ ($h \in \pi_1(Q)$). (7.219)


This identification is explicitly given by


$\tilde{a}(\tilde{q},\tilde{q}') = a([\tilde{q},\tilde{q}'])$, (7.220)


where $[\tilde{q},\tilde{q}']$ denotes the equivalence class of $(\tilde{q},\tilde{q}') \in \tilde{Q} \times \tilde{Q}$ under the diagonal
action of $\pi_1(Q)$. This makes the space $C_c^\infty(G_Q)$ a dense subset of $C^*(G_Q)$. We write
$\tilde{a} \in C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$; for (7.208) this just means that $\tilde{a}$ is a permutation-invariant
kernel. Second, we equip $\tilde{Q}$ with some measure $d\tilde{q}$ that is locally equivalent to the
Lebesgue measure, and in addition is $\pi_1(Q)$-invariant under the regular action R of
$\pi_1(Q)$ on functions on $\tilde{Q}$, given, as usual, by $R_h\psi(\tilde{q}) = \psi(\tilde{q}h)$. In that case, one also
has a measure $dq$ on Q that is locally equivalent to the Lebesgue measure, so that
the measures $d\tilde{q}$ and $dq$ on $\tilde{Q}$ and Q, respectively, are related by


$\int_{\tilde{Q}} d\tilde{q}\, f(\tilde{q}) = \frac{1}{|\pi_1(Q)|} \sum_{h \in \pi_1(Q)} \int_Q dq\, f(s(q)h)$. (7.221)


Here $f \in C_c(\tilde{Q})$, $|\pi_1(Q)|$ is the number of elements of $\pi_1(Q)$, and $s : Q \to \tilde{Q}$ is any
(measurable) cross-section of $\pi : \tilde{Q} \to Q$. We may then define a Hilbert space $L^2(\tilde{Q})$
with respect to $d\tilde{q}$, on which elements $\tilde{a}$ of $C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$ act faithfully by


$\tilde{a}\psi(\tilde{q}) = \int_{\tilde{Q}} d\tilde{q}'\, \tilde{a}(\tilde{q},\tilde{q}')\psi(\tilde{q}')$. (7.222)


The product of two such operators is given by the multiplication of the kernels on
$\tilde{Q}$, and involution is defined as expected, too, namely by hermitian conjugation:


$\tilde{a}^*(\tilde{q},\tilde{q}') = \overline{\tilde{a}(\tilde{q}',\tilde{q})}$. (7.223)


The norm-closure of $C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$, represented as operators on $L^2(\tilde{Q})$ by (7.222),
is then given by $B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$. This proves (7.217).

Eq. (7.218) is a special case of the following: let $X$ be a manifold carrying a
free action of a compact group $G$. If $L^2(X)$ is defined by some $G$-invariant “locally
Lebesgue” measure on $X$, as in the construction above, then one has an isomorphism


$$B_0(L^2(X))^G \cong B_0(L^2(X/G)) \otimes C^*(G). \tag{7.224}$$


This is proved in a similar way, realizing $B_0(H)$ as the norm-completion of the
Hilbert-Schmidt operators $B_2(H)$ (for general $H$), and, in the $L^2$-case at hand,
identifying $B_2(L^2(X))$ with the algebra of operators with kernels in $L^2(X \times X)$.

Part 2 of the theorem now follows from the fact that for any Hilbert space H the 
C*-algebra Bo(H) of compact operators on H has exactly one irreducible represen- 
tation (up to unitary equivalence), i.e. the defining one (this can be proved in many 
ways, e.g. from Rieffel’s theory of Morita equivalence of C*-algebras), combined 
with the bijective correspondence between continuous unitary representations u of 
any locally compact group G and non-degenerate representations of its associated 
group C*-algebra C*(G); see §C.18, Definition C.119 etc. 


As mentioned in Theorem 7.12, there are two ways of realizing the Hilbert space
$H^\chi$, where $\chi$ labels some irreducible representation of $\pi_1(Q)$. This is very similar
to the discussion in §7.5, so we will be relatively brief here. The first realization
corresponds to having constrained wave-functions defined on the covering space $\tilde{Q}$;
for example, the usual description of bosonic or fermionic wave-functions is of this
sort. The second realization uses unconstrained wave-functions on the actual
configuration space $Q$ (some authors confusingly call such functions “multi-valued”).


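1. The space $\Gamma(Q, E_\chi)$ of smooth cross-sections of $E_\chi$ may be given by the
smooth maps $\tilde{\psi}: \tilde{Q} \to H_\chi$ satisfying the equivariance condition (“constraint”)


$$\tilde{\psi}(\tilde{q}h) = u_\chi(h^{-1})\, \tilde{\psi}(\tilde{q}), \tag{7.225}$$
for all $h \in \pi_1(Q)$, $\tilde{q} \in \tilde{Q}$. The Hilbert space


$$H^\chi = L^2(\tilde{Q}, H_\chi)^{\pi_1(Q)}, \tag{7.226}$$
then, is defined as the usual $L^2$-completion of the space of all $\tilde{\psi} \in \Gamma(Q, E_\chi)$ for
which $\langle \tilde{\psi}, \tilde{\psi} \rangle < \infty$. The irreducible representation $\pi^\chi(C^*(G_Q))$ is then given on
elements $\tilde{a}$ of the dense subspace $C_c^\infty(G_Q)$ of $C^*(G_Q)$ by the expression


$$\pi^\chi(\tilde{a})\tilde{\psi}(\tilde{q}) = \int_{\tilde{Q}} d\tilde{q}'\, \tilde{a}([\tilde{q}, \tilde{q}'])\, \tilde{\psi}(\tilde{q}'); \tag{7.227}$$


any $\pi_1(Q)$-invariant operator on $L^2(\tilde{Q})$ acts on $H^\chi$ in this way (ignoring $H_\chi$).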

If $\pi_1(Q)$ is finite, then two simplifications occur. Firstly, $H_\chi$ is finite-dimensional,
and secondly each Hilbert space $H^\chi$ may be regarded as a subspace of $L^2(\tilde{Q})$; the
above action of $C^*(G_Q)$ on $H^\chi$ is then simply given by restriction of its action on
$L^2(\tilde{Q})$. In that case one may equivalently realize this irreducible representation
in terms of the right-hand side of (7.217), in which case the action of $\pi^\chi(a)$ on
$H^\chi$ as defined in (7.226) is given by


$$\pi^\chi(a)\tilde{\psi}(\tilde{q}) = \int_{\tilde{Q}} d\tilde{q}'\, a(\tilde{q}, \tilde{q}')\, \tilde{\psi}(\tilde{q}'). \tag{7.228}$$


This is true as it stands if $a \in C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$, cf. (7.219), and may be extended
to general $\pi_1(Q)$-invariant compact operators $a \in B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$ by norm
continuity, and, furthermore, even to $B(L^2(\tilde{Q}))^{\pi_1(Q)}$ by strong or weak continuity.

2. Elements of the Hilbert space $L^2(\tilde{Q}, H_\chi)^{\pi_1(Q)}$ are typically (equivalence classes
of) discontinuous cross-sections of $E_\chi$. Possibly discontinuous cross-sections
may simply be given directly as functions $\psi: Q \to H_\chi$, with inner product


$$\langle \psi, \varphi \rangle = \int_Q dq\, \langle \psi(q), \varphi(q) \rangle_{H_\chi}. \tag{7.229}$$


This specific realization of $L^2(Q, E_\chi)$ will be denoted by $L^2(Q) \otimes H_\chi$. If $H_\chi = \mathbb{C}$,
$$L^2(Q) \otimes H_\chi = L^2(Q). \tag{7.230}$$


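These equivalent descriptions of $H^\chi$ may be related once a (typically discontinuous)
cross-section $\sigma: Q \to \tilde{Q}$ of the projection $\tau: \tilde{Q} \to Q$ has been chosen (i.e., $\tau \circ \sigma =
\mathrm{id}_Q$), in which case $\psi(q) = \tilde{\psi}(\sigma(q))$. We formalize this in terms of a unitary


$$u: L^2(\tilde{Q}, H_\chi)^{\pi_1(Q)} \to L^2(Q) \otimes H_\chi; \tag{7.231}$$
$$u\tilde{\psi}(q) = \tilde{\psi}(\sigma(q)); \tag{7.232}$$
$$u^{-1}\psi(\tilde{q}) = u_\chi(h)\, \psi(q), \tag{7.233}$$


where $q = \tau(\tilde{q})$, and $h$ is the unique element of $\pi_1(Q)$ for which $\tilde{q}h = \sigma(q)$. The
action $\pi^\chi_\sigma(a) = u\pi^\chi(a)u^{-1}$ on $L^2(Q) \otimes H_\chi$ now follows from (7.228) - (7.233): if $a$
is a $\pi_1(Q)$-invariant kernel on $L^2(\tilde{Q})$, then using (7.221) we obtain


$$\pi^\chi_\sigma(a)\psi(q) = \frac{1}{|\pi_1(Q)|} \sum_{h \in \pi_1(Q)} \int_Q dq'\, a(\sigma(q), \sigma(q')h)\, u_\chi(h^{-1})\, \psi(q'). \tag{7.234}$$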
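
The step from (7.228) to (7.234) can be spelled out as follows. This is only a sketch: it assumes the normalization of (7.221) with the cross-section $s = \sigma$, the equivariance convention (7.225), and the reading of (7.233) in which $h$ is fixed by $\tilde{q}h = \sigma(q)$.

```latex
\begin{align*}
\pi^\chi_\sigma(a)\psi(q)
 &= \bigl(\pi^\chi(a)\,u^{-1}\psi\bigr)(\sigma(q))
  = \int_{\tilde Q} d\tilde q'\, a(\sigma(q),\tilde q')\,(u^{-1}\psi)(\tilde q') \\
 &= \frac{1}{|\pi_1(Q)|}\sum_{h\in\pi_1(Q)}\int_Q dq'\,
    a(\sigma(q),\sigma(q')h)\,(u^{-1}\psi)(\sigma(q')h) \\
 &= \frac{1}{|\pi_1(Q)|}\sum_{h\in\pi_1(Q)}\int_Q dq'\,
    a(\sigma(q),\sigma(q')h)\,u_\chi(h^{-1})\,\psi(q').
\end{align*}
```

In the last line (7.233) is applied to $\tilde{q}' = \sigma(q')h$, for which the unique group element carrying $\tilde{q}'$ back to $\sigma(q')$ is $h^{-1}$. Relabelling $h \mapsto h^{-1}$ and using the invariance (7.219) gives the equivalent form with integrand $a(\sigma(q)h, \sigma(q'))\, u_\chi(h)\, \psi(q')$.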
We now apply this formalism to $N$ indistinguishable particles moving on the
(single-particle) configuration space $\mathbb{R}^3$. Eq. (7.208) then gives the $N$-particle space


$$Q_N = ((\mathbb{R}^3)^N - \Delta_N)/\mathfrak{S}_N. \tag{7.235}$$
The universal covering space of this multiply connected space is
$$\tilde{Q}_N = (\mathbb{R}^3)^N - \Delta_N, \tag{7.236}$$
which (unlike its counterpart in $d = 2$) is connected and simply connected, so that
$$\pi_1(Q_N) = \mathfrak{S}_N. \tag{7.237}$$
It follows from (7.217) and (7.237) that the algebra of observables is given by
$$C^*(G_{Q_N}) = B_0(L^2(\mathbb{R}^3)^{\otimes N})^{\mathfrak{S}_N}. \tag{7.238}$$


Comparing (7.238) with (7.194), we obtain a complete equivalence between the
“quantum observables” approach and the deformation quantization approach based
on Theorem 7.12, in that the configuration space approach through the
representation theory of the groupoid C*-algebra $C^*(G_{Q_N})$ leads to the same classification as
the “quantum observables” approach based on (7.188) above, cf. (7.197) - (7.199).
We discuss a few interesting special cases.


N=1. Here $Q_1 = \tilde{Q}_1 = \mathbb{R}^3$ and $\pi_1(Q_1) = \{e\}$, so the algebra of observables is
$$C^*(G_{Q_1}) = B_0(L^2(\mathbb{R}^3)), \tag{7.239}$$


which has a unique irreducible representation on $L^2(\mathbb{R}^3)$.


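N=2. This time, the pertinent homotopy group is
$$\pi_1(Q_2) = \mathfrak{S}_2 = \mathbb{Z}_2 = \{e, (12)\}, \tag{7.240}$$


which has two irreducible representations: firstly, $u_B(p) = 1$ for both $p \in \mathfrak{S}_2$,
and secondly, $u_F(e) = 1$, $u_F(12) = -1$, each realized on $H_\chi = \mathbb{C}$. Hence with
$q = (x, y, z) \in \mathbb{R}^3$, eq. (7.225) yields


$$H^B = \{\psi \in L^2(\mathbb{R}^3)^{\otimes 2} \mid \psi(q_2, q_1) = \psi(q_1, q_2)\}; \tag{7.241}$$
$$H^F = \{\psi \in L^2(\mathbb{R}^3)^{\otimes 2} \mid \psi(q_2, q_1) = -\psi(q_1, q_2)\}. \tag{7.242}$$


Here $L^2(\mathbb{R}^3)^{\otimes 2} = L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) \cong L^2(\mathbb{R}^6)$. The C*-algebra


$$C^*(G_{Q_2}) = B_0(L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3))^{\mathfrak{S}_2} \cong B_0(L^2(\mathbb{R}^3 \times \mathbb{R}^3))^{\mathfrak{S}_2} \tag{7.243}$$


consists of all $\mathfrak{S}_2$-invariant compact operators on $L^2(\mathbb{R}^3 \times \mathbb{R}^3)$, acting on $H^B$ or
$H^F$ in the same way as they do on $L^2(\mathbb{R}^6)$; cf. (7.228), noting that the constraints
in (7.241) and (7.242) are preserved due to the $\mathfrak{S}_2$-invariance of $a \in C^*(G_{Q_2})$.
This recovers Dirac’s description of statistics given earlier in this section.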
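N=3. Here we have a non-abelian homotopy group


$$\pi_1(Q_3) = \mathfrak{S}_3, \tag{7.244}$$


which, besides the irreducible boson and fermion representations on $\mathbb{C}$, has an
irreducible parafermionic representation $u_P$ on $H_P = \mathbb{C}^2$. This representation is
most easily obtained explicitly by reducing the natural action of $\mathfrak{S}_3$ on $\mathbb{C}^3$. Define
an orthonormal basis of the latter by


$$e_0 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}; \quad e_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}; \quad e_2 = \frac{1}{\sqrt{6}} \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}. \tag{7.245}$$


It follows that $\mathbb{C} \cdot e_0$ carries the trivial representation of $\mathfrak{S}_3$, whereas the linear
span of $e_1$ and $e_2$ carries a two-dimensional irreducible representation $u_P$, given
on the generators (12), (13), and (23) of $\mathfrak{S}_3$ by


$$u_P(12) = \frac{1}{2} \begin{pmatrix} 1 & -\sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix}; \quad u_P(13) = \frac{1}{2} \begin{pmatrix} 1 & \sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}; \quad u_P(23) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7.246}$$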


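We already gave realizations of the Hilbert space $H^P$ of three parafermions
in (7.203) and (7.204), where it emerged as a subspace of $L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) \otimes
L^2(\mathbb{R}^3) \cong L^2(\mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3)$. An equivalent realization $H^\chi = H^P$ may be given
on the basis of (7.225), according to which $H^P$ is the subspace of $L^2(\mathbb{R}^3)^{\otimes 3} \otimes \mathbb{C}^2 \cong
L^2(\mathbb{R}^9) \otimes \mathbb{C}^2$ that consists of doublet wave-functions $\psi_i$ ($i = 1, 2$) that satisfy


$$\psi_i(q_{p(1)}, q_{p(2)}, q_{p(3)}) = \sum_{j=1}^{2} u_{ij}(p)\, \psi_j(q_1, q_2, q_3), \tag{7.247}$$


for any permutation $p \in \mathfrak{S}_3$, where $u = u_P$, cf. (7.246). I.e., the parafermionic
wave-functions in this realization of $H^P$ are constrained by the conditions


$$\psi_1(q_2, q_1, q_3) = \tfrac{1}{2}\psi_1(q_1, q_2, q_3) - \tfrac{1}{2}\sqrt{3}\, \psi_2(q_1, q_2, q_3); \tag{7.248}$$
$$\psi_2(q_2, q_1, q_3) = -\tfrac{1}{2}\sqrt{3}\, \psi_1(q_1, q_2, q_3) - \tfrac{1}{2}\psi_2(q_1, q_2, q_3); \tag{7.249}$$
$$\psi_1(q_3, q_2, q_1) = \tfrac{1}{2}\psi_1(q_1, q_2, q_3) + \tfrac{1}{2}\sqrt{3}\, \psi_2(q_1, q_2, q_3); \tag{7.250}$$
$$\psi_2(q_3, q_2, q_1) = \tfrac{1}{2}\sqrt{3}\, \psi_1(q_1, q_2, q_3) - \tfrac{1}{2}\psi_2(q_1, q_2, q_3); \tag{7.251}$$
$$\psi_1(q_1, q_3, q_2) = -\psi_1(q_1, q_2, q_3); \tag{7.252}$$
$$\psi_2(q_1, q_3, q_2) = \psi_2(q_1, q_2, q_3). \tag{7.253}$$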


The algebra of observables $C^*(G_{Q_3})$ of three indistinguishable particles without
internal degrees of freedom then acts on $H^P \subset L^2(\mathbb{R}^9) \otimes \mathbb{C}^2$ as in (7.234),
identifying $a \in C^*(G_{Q_3})$ with $a \otimes 1_2$ (so that $a$ ignores the internal degree of
freedom $\mathbb{C}^2$). This representation $\pi^P$ is irreducible by Theorem 7.12.




N>3. The above construction may be generalized to any $N > 3$. There will now
be many parafermionic representations $u_\chi$ of $\mathfrak{S}_N$ (given by Young tableaux), each
of which induces an irreducible representation of the C*-algebra (7.238).
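
The matrices of the two-dimensional representation of the symmetric group on three letters can be checked numerically by reducing its natural action on three-dimensional space. The sketch below is only illustrative: it assumes that transpositions act by exchanging coordinates, and that the adapted basis is $e_0 \propto (1,1,1)$, $e_1 \propto (0,1,-1)$, $e_2 \propto (-2,1,1)$ with these sign conventions; the helper names are hypothetical.

```python
import math

SQ2, SQ3, SQ6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Orthonormal basis of R^3 (real part of C^3) adapted to S_3; signs are an assumption.
E0 = (1 / SQ3, 1 / SQ3, 1 / SQ3)
E1 = (0.0, 1 / SQ2, -1 / SQ2)
E2 = (-2 / SQ6, 1 / SQ6, 1 / SQ6)


def dot(u, v):
    """Euclidean inner product of two real 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def swap(v, i, j):
    """Transposition (i j) acting on a vector by exchanging coordinates i and j (1-based)."""
    w = list(v)
    w[i - 1], w[j - 1] = w[j - 1], w[i - 1]
    return tuple(w)


def u_P(i, j):
    """2x2 matrix of (i j) on span{E1, E2}: entry [r][c] = <E_r, (i j) E_c>."""
    basis = (E1, E2)
    return [[dot(basis[r], swap(basis[c], i, j)) for c in range(2)] for r in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))


# Matrices quoted in the text for the three transpositions.
EXPECTED = {
    (1, 2): [[0.5, -SQ3 / 2], [-SQ3 / 2, -0.5]],
    (1, 3): [[0.5, SQ3 / 2], [SQ3 / 2, -0.5]],
    (2, 3): [[-1.0, 0.0], [0.0, 1.0]],
}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
```

Comparing `u_P(i, j)` with `EXPECTED[(i, j)]` via `close` confirms the three matrices, and the relation `u_P(1, 2) u_P(2, 3) u_P(1, 2) = u_P(1, 3)` reflects the conjugation $(12)(23)(12) = (13)$ in the group.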


The question now arises whether parastatistics is to be found in Nature, or,
indeed, whether this question is even well defined! As a warm-up to the case $N = 3$, where
the question first plays a role, let us give an alternative realization of $\pi^F(C^*(G_{Q_2}))$,
cf. Theorem 7.12. Take two isospin doublet bosons (which by definition transform
under the defining spin-$\tfrac{1}{2}$ representation $D_{1/2}$ of $SU(2)$ on $\mathbb{C}^2$). With


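$$H^{(2)} = (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes 2}, \tag{7.254}$$


and using indices $a_1, a_2 = 1, 2$, the Hilbert space of these bosons is


$$H^{(2)}_B = \{\psi \in H^{(2)} \mid \psi_{a_2 a_1}(q_2, q_1) = \psi_{a_1 a_2}(q_1, q_2)\}, \tag{7.255}$$


with corresponding projection $e^{(2)}_B: H^{(2)} \to H^{(2)}_B$ given by


$$e^{(2)}_B \psi_{a_1 a_2}(q_1, q_2) = \tfrac{1}{2}\bigl(\psi_{a_2 a_1}(q_2, q_1) + \psi_{a_1 a_2}(q_1, q_2)\bigr). \tag{7.256}$$


Subsequently, define a partial isometry $w: H^{(2)} \to L^2(\mathbb{R}^3)^{\otimes 2}$ by
$$w\psi(q_1, q_2) \equiv \psi_0(q_1, q_2) = \frac{1}{\sqrt{2}}\bigl(\psi_{12}(q_1, q_2) - \psi_{21}(q_1, q_2)\bigr). \tag{7.257}$$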


Physically, this singles out an isospin singlet Hilbert subspace $H^{(2)}_0 = e_0 H^{(2)}$ within
$H^{(2)}$, where $e_0 = w^*w$ (which is a projection). This singlet subspace may be
constrained to the bosonic sector by passing to

$$H^{(2)}_F = e_0 e^{(2)}_B H^{(2)}; \tag{7.258}$$
note that $e_0$ and $e^{(2)}_B$ commute. Now extend the defining representation of $C^*(G_{Q_2})$
on $L^2(\mathbb{R}^3)^{\otimes 2}$ to $H^{(2)}$ by ignoring the indices $a_1, a_2$ (i.e., isospin is deemed
unobservable). This extended representation commutes with $e_0$ and with $e^{(2)}_B$, and hence
is well defined on $H^{(2)}_F \subset H^{(2)}$. Let us denote this representation of $C^*(G_{Q_2})$ by $\pi^{(2)}_F$. It
is then immediate from the property $\psi_0(q_2, q_1) = -\psi_0(q_1, q_2)$ that:


Proposition 7.13. The representations $\pi^{(2)}_F(C^*(G_{Q_2}))$ on $H^{(2)}_F$ and $\pi^F(C^*(G_{Q_2}))$ on
$H^F$ are unitarily equivalent.


In other words, two fermions without internal degrees of freedom are equivalent
to the singlet state of two bosons with an isospin degree of freedom, at least if
the observables are isospin-blind. Similarly, two bosons without internal degrees of
freedom are equivalent to the singlet state of two fermions with isospin, and two
fermions without internal degrees of freedom are equivalent to the isospin triplet
state of two fermions (this corresponds to the Schur decomposition of $(\mathbb{C}^2)^{\otimes 2}$ under
the commuting actions of $\mathfrak{S}_2$ and $SU(2)$).
For $N = 3$ we may carry out a similar trick as for $N = 2$, and replace parafermions
without (further) degrees of freedom by either bosons or fermions. We discuss the
former and leave the explicit description of the various alternatives to
the reader. We proceed as for $N = 2$, mutatis mutandis. We have a Hilbert space


$$H^{(3)} = (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes 3}, \tag{7.259}$$


of three distinguishable isospin doublets, containing the Hilbert space $H^{(3)}_B$ of three
bosonic isospin doublets as a subspace, that is,


$$H^{(3)}_B = \{\psi \in H^{(3)} \mid \psi_{a_{p(1)} a_{p(2)} a_{p(3)}}(q_{p(1)}, q_{p(2)}, q_{p(3)}) = \psi_{a_1 a_2 a_3}(q_1, q_2, q_3) \ (p \in \mathfrak{S}_3)\}. \tag{7.260}$$

The corresponding projection, denoted by $e^{(3)}_B: H^{(3)} \to H^{(3)}_B$, will not be written
down explicitly. Define an $SU(2)$ doublet $(\psi_1, \psi_2)$ within the space $H^{(3)}$ through a
partial isometry


$$w: H^{(3)} \to L^2(\mathbb{R}^3)^{\otimes 3} \otimes \mathbb{C}^2; \tag{7.261}$$
$$(w\psi)_1(q_1, q_2, q_3) = \frac{1}{\sqrt{2}}\bigl(\psi_{121}(q_1, q_2, q_3) - \psi_{112}(q_1, q_2, q_3)\bigr); \tag{7.262}$$
$$(w\psi)_2(q_1, q_2, q_3) = \frac{1}{\sqrt{6}}\bigl(-2\psi_{211}(q_1, q_2, q_3) + \psi_{121}(q_1, q_2, q_3) + \psi_{112}(q_1, q_2, q_3)\bigr). \tag{7.263}$$


Defining a projection $e_2 = w^*w$ on $H^{(3)}$, the Hilbert space $H^{(3)}$ contains a closed
subspace $H^{(3)}_P = e_2 e^{(3)}_B H^{(3)}$, which is stable under the natural representation of
$C^*(G_{Q_3})$ (since $e_2$ and $e^{(3)}_B$ commute). We call this representation $\pi^{(3)}_P$. An easy
calculation then establishes:


Proposition 7.14. The representations $\pi^{(3)}_P(C^*(G_{Q_3}))$ on $H^{(3)}_P$ and $\pi^P(C^*(G_{Q_3}))$ on
$H^P$ (as defined by Theorem 7.12) are unitarily equivalent.


In other words, three parafermions without internal degrees of freedom are
equivalent to an isospin doublet formed by three identical bosonic isospin doublets, at
least if the observables of the latter are isospin-blind. This corresponds to the Schur
decomposition of $(\mathbb{C}^2)^{\otimes 3}$ under the commuting actions of $\mathfrak{S}_3$ and $SU(2)$; in this
decomposition, the spin-$\tfrac{3}{2}$ representation of $SU(2)$ couples to the bosonic
representation of $\mathfrak{S}_3$, whilst the spin-$\tfrac{1}{2}$ representation of $SU(2)$ couples
to the parafermionic representation of $\mathfrak{S}_3$. Many other realizations of parafermions
in terms of fermions or bosons with an internal degree of freedom can be constructed
in a similar way.
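
The multiplicities behind this decomposition can be checked with a short character computation. The sketch below is an illustration, not the author's method: it uses the standard facts that a permutation acting on an $N$-fold tensor power of an $n$-dimensional space by permuting factors has trace $n$ raised to its number of cycles, and that the multiplicity of an irreducible representation is the average of its (here real) character against that trace. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(p):
    """Number of cycles of a permutation given as a tuple with p[i] the image of i."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            k = start
            while k not in seen:
                seen.add(k)
                k = p[k]
    return cycles


def trivial(p):
    """Character of the bosonic (trivial) representation."""
    return 1


def sign(p):
    """Character of the fermionic (sign) representation."""
    return (-1) ** (len(p) - cycle_count(p))


def standard_s3(p):
    """Character of the two-dimensional irrep of S_3: number of fixed points minus one."""
    return sum(1 for i, image in enumerate(p) if i == image) - 1


def multiplicity(character, n, N):
    """Multiplicity of an irrep of S_N in the N-fold tensor power of C^n."""
    group = list(permutations(range(N)))
    total = sum(Fraction(character(p)) * n ** cycle_count(p) for p in group)
    return total / len(group)
```

For $n = 2$, $N = 3$ this gives multiplicity 4 for the bosonic representation (the spin-3/2 quadruplet), 0 for the fermionic one, and 2 for the two-dimensional representation (the isospin doublet); for $N = 2$ it gives 3 (triplet) and 1 (singlet).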

For $N > 3$ we similarly find that the representation of the C*-algebra (7.238)
induced by some parafermionic representation $u_\chi$ of $\mathfrak{S}_N$ is unitarily equivalent
to a representation on some $SU(n)$ multiplet of bosons with an internal degree of
freedom; the appropriate multiplet is the one coupled to $u_\chi$ in the Schur reduction
of $(\mathbb{C}^n)^{\otimes N}$ with respect to the natural and commuting actions of $\mathfrak{S}_N$ and $SU(n)$.
The moral of this story is that one cannot tell from glancing at some Hilbert space
whether the world consists of fermions or bosons or parafermions; what matters is 
the Hilbert space as a carrier of some (irreducible) representation of the algebra 
of observables. From that perspective we already see for N = 2 that being bosonic 
or fermionic is not an invariant property of such representations, since one may 
freely choose between fermions/bosons without internal degrees of freedom and 
bosons/fermions with internal degrees of freedom. In a more systematic discussion 
using superselection theory one may impose some physical selection criterion in or- 
der to restrict attention to “physically interesting” sectors. Such criteria (which, for 
example, would have the goal of excluding parastatistics) should be formulated with 
reference to some algebra of observables. Such issues cannot be settled at the level 
of quantum mechanics and instead require quantum field theory, where parastatis- 
tics can always be removed in terms of either bose- or fermi-statistics, in somewhat 
similar vein to our discussion. For (nonlocal) charges in gauge theories there are no 
rigorous results, but historically a similar goal played a role in the road to quantum 
chromodynamics (QCD), which is one of the ingredients of the Standard Model. 

A different argument against parastatistics arises from the state space approach
based on the compact convex set $\mathcal{D}(H^N)^{\mathfrak{S}_N}$ studied at the beginning of this
section. The extreme boundary $\partial_e(\mathcal{D}(H^N)^{\mathfrak{S}_N})$ consists of one part that is contained
in $\partial_e \mathcal{D}(H^N) = \mathcal{P}_1(H^N)$, and one that is not. The first part consists of those
one-dimensional invariant projections $e \in \mathcal{P}_1(H^N)^{\mathfrak{S}_N}$ whose image $eH^N$ belongs to
either the bosonic subspace $H^N_B$ (in which case $u(p)e = e$ for each $p \in \mathfrak{S}_N$) or the
fermionic subspace $H^N_F$ of $H^N$ (in which case $u(p)e = \mathrm{sgn}(p)e$ for each $p \in \mathfrak{S}_N$);
in other words, pure bosonic or fermionic states on $B(H^N)^{\mathfrak{S}_N}$ are also pure on
$B(H^N)$. The second part, then, consists of parastatistical pure states on $B(H^N)^{\mathfrak{S}_N}$,
which are therefore mixed on $B(H^N)$. Furthermore, pure bosonic or fermionic
states on $B(H^N)^{\mathfrak{S}_N}$ both extend and restrict to pure bosonic or fermionic states on
$B(H^{N+1})^{\mathfrak{S}_{N+1}}$ and $B(H^{N-1})^{\mathfrak{S}_{N-1}}$, respectively, whereas parastatistical pure states
turn out to have neither property and hence are “isolated” at the given value of $N$.

Finally, in $d = 2$ the equivalence between the operator and configuration space
approaches breaks down, because $\mathfrak{S}_N \neq \pi_1(Q_N) = B_N$, i.e., the braid group on $N$
strings. Even defining the operator quantum theory on $H_N = L^2(\tilde{Q}_N)$, with algebra
of observables $M_N = B(L^2(\tilde{Q}_N))^{B_N}$, fails to rescue the equivalence, because the
decomposition of $H_N$ under $M_N$ by no means contains all irreducible representations
of $B_N$. In this case deformation quantization gives many more sectors than the
improved operator approach (which already gave more sectors than the approach using
‘multi-valued’ scalar wave-functions).
Notes


The quotations in the preamble are from Dirac (1947), p. 87. Similarly, the
Dreimännerarbeit (Born, Heisenberg, & Jordan, 1926) bluntly states (in Ch. 1, §1) that:


‘one can see from eq. (5) [i.e., $pq - qp = -i\hbar \cdot 1$, cf. our eq. (7.5)] that in the limit $\hbar = 0$,
the new theory would converge to the classical theory, as is physically required.’


§7.1. Deformation quantization

In the wake of Dirac’s famous insight on the analogy between the Poisson bracket
and the commutator in quantum mechanics, the idea of deformation quantization (in
the form of what we now call star products) may be traced back to Groenewold
(1946) and Moyal (1949). The mathematical (physics) literature on the subject
started with Berezin (1975) and Bayen et al. (1978), who introduced what we now
call formal deformation quantization, in which $\hbar$ is not a real number but a formal
parameter occurring in formal power series. The C*-algebraic setting for
deformation quantization we use was introduced by Rieffel (1989, 1994); see also Landsman
(1998a), Chapter 2, for a detailed treatment.


§7.2. Quantization and internal symmetry 
This section is based on Rieffel (1990) and Landsman (1998a), Chapter 3. 


§7.3. Quantization and external symmetry 
§7.4. Intermezzo: The Big Picture
§7.5. Induced representations and the imprimitivity theorem
§7.6. Representations of semi-direct products
The action Poisson bracket (7.58) was introduced by Krishnaprasad & Marsden 
(1987); see also Marsden & Ratiu (1994). 


Systems of imprimitivity and their applications to representation theory, semi- 
direct products, and quantum mechanics are due to Mackey (1958, 1968), who 
was inspired by Weyl (1927, 1928), von Neumann (1932), and Wigner (1939). As 
Mackey (1978, 1992) describes, he saw his work as the development of what he 
calls Weyl’s Program. Weyl (1927) posed two questions in quantum mechanics:


1. ‘How to construct the matrix or Hermitian form¹ that represents some quantity
given in the context of a known physical system?’²

2. ‘Given this Hermitian form, what is its physical meaning, and which physical
statements can we make about it?’³


Weyl considered the second question to have been resolved by von Neumann’s
recent work, and so he concentrated on the first, which he tried to answer using
group theory. The main achievement of Weyl (1927), elaborated in his subsequent
book Weyl (1928), was a reformulation of the canonical commutation relations
$i[\hat{p}, \hat{q}] = \hbar \cdot 1_H$ in terms of projective unitary representations of the additive group
$\mathbb{R}^2$ (or, equivalently, of unitary representations of the associated Heisenberg group).
He also introduced the formula (7.21) in an equivalent form where the (classical)
Fourier expansion of $f$, i.e.,


$$f(p, q) = \int_{\mathbb{R}^2} da\, db\, e^{iap + ibq} \hat{f}(a, b), \tag{7.264}$$


is “quantized” by the operator in which $\exp(iap + ibq)$ in the above formula is
replaced by the (projective) unitary representative $u(a, b)$ of $(a, b) \in \mathbb{R}^2$ just
mentioned, i.e., the real numbers $p$ and $q$ are replaced by the corresponding operators $\hat{p}$
and $\hat{q}$, as in (7.3) - (7.4). In particular, Weyl treated $p$ and $q$ symmetrically.


¹ Like Hilbert himself, Weyl at the time still thought of operators in terms of matrices or Hermitian
forms, rather than abstractly, like von Neumann. Also cf. our Introduction.

² ‘Wie komme ich zu der Matrix, der Hermiteschen Form, welche eine gegebene Größe in einem
seiner Konstitution nach bekannten physikalischen System repräsentiert?’ (Weyl, 1927, p. 1)

³ ‘Wenn einmal die Hermitesche Form gewonnen ist, was ist ihre physikalische Bedeutung, was
für physikalische Aussagen kann ich ihr entnehmen?’ (ibid.)

In his development of Weyl’s Program, Mackey broke the symmetry between $p$
and $q$, in that he saw the momentum operator $\hat{p}$ as the (“infinitesimal”) generator
of a unitary representation of the additive group $\mathbb{R}$, whereas the position operator
$\hat{q}$ was replaced by a projection-valued measure on the real line; this is equivalent
to a nondegenerate representation of the commutative C*-algebra $C_0(Q)$, as in our
discussion in §7.3. This way of tearing $p$ and $q$ apart was the key to the general case
of quantizing group actions on configuration space discussed in §7.3.

In their independent elaboration of Weyl’s ideas, Groenewold (1946) and Moyal 
(1949) emphasized the deformation aspect of quantization (including the classical 
limit) rather than its group-theoretical underpinning; the former aspect is completely 
absent in Mackey’s work. “The Big Picture” (Landsman, 1998a, Ch. 3; Landsman & 
Ramazan, 2001; Landsman, 2007) is an attempt to have the best of both worlds, in 
that the role of Lie groupoids delivers the symmetry aspect of quantization, whereas 
our (i.e. Rieffel’s) very definition of quantization puts the deformation aspect in the 
front seat. The underlying theory of Lie groupoids and Lie algebroids may be found 
in Moerdijk & Mrčun (2003) or Mackenzie (2005); see also Landsman (1998a).


A comprehensive study of the Mackey—Glimm dichotomy may be found in 
Williams (2007), which contains a wealth of information on crossed product C*- 
algebras and induced representations in general. 


The representation theory of the Poincaré group was first studied (using
somewhat heuristic methods) by Wigner (1939) using induced representations. The entire
subject was subsequently taken up and finished by Mackey. For treatments in the
spirit of (mathematical) physics see e.g. Simms (1968), Niederer & O’Raifeartaigh
(1974), and Barut & Rączka (1977). Lemma 7.10 is proved by Bargmann (1954).


Among the known elementary particles, the case $j = 0$ (and $m > 0$) corresponds
to the Higgs boson, whereas $j = \tfrac{1}{2}$ gives all known fermionic particles (i.e.,
electrons, quarks, neutrinos, and their antiparticles). If one counts the gauge bosons $W^\pm$
and $Z^0$ as massive, they provide the case $j = 1$, but in the fundamental Lagrangian
they are massless and correspond to helicity $n = \pm 1$, like the photon. Helicity $\pm 2$
gives the graviton. We discard particles predicted by supersymmetry, which
evidently does not exist in nature (this evidence seems lost on string theorists).
§7.7. Quantization and permutation symmetry

This section is based on Landsman (2016a). The literature on indistinguishable
particles is enormous, initiated by Heisenberg (1926) and Dirac (1926). What we
call the “quantum observables” approach goes back to Messiah & Greenberg (1964);
see also Drühl, Haag, & Roberts (1970). Key papers in the configuration space
approach are Souriau (1967), Laidlaw & DeWitt-Morette (1971) and Leinaas &
Myrheim (1977). More generally, for the quantization of multiply connected spaces
see Dowker (1972), Schulman (1981), Isham (1984), Horvathy, Morandi, &
Sudarshan (1989), Morchio & Strocchi (2007), and Morandi (1992). The state space
approach to indistinguishable particles was proposed by Bach (1997), who proves
(7.192) - (7.193), as well as the claim following these equations to the effect that
$\rho \in \partial_e \mathcal{D}(H^N)^{\mathfrak{S}_N}$ iff $eH$ is irreducible. The state space arguments against
parastatistics given near the end of this section are also due to Bach (1997).

The representation theory used in this section may be found in many books, such 
as Weyl (1928), Fulton (1997), or Goodman & Wallach (2000). 

The groupoid (7.209) is a special case of the so-called gauge groupoid defined
by a principal $H$-bundle $P \xrightarrow{\pi} Q$, where $G_1 = P \times_H P$ (which stands for $(P \times P)/H$
with respect to the diagonal $H$-action on $P \times P$), $G_0 = Q$, and the operations are


$$s([p, q]) = \pi(q), \quad t([p, q]) = \pi(p), \quad [p, q]^{-1} = [q, p], \quad [p, q][q, r] = [p, r];$$


here $[p, q][q', r]$ is defined whenever $\pi(q) = \pi(q')$, but to write down the product
one picks some element $q \in \pi^{-1}(\pi(q'))$.
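
The way the product depends on a choice of representatives can be made concrete in a finite toy model. The following sketch uses entirely hypothetical data (a trivial bundle over a three-point base with a two-element structure group) and hypothetical function names; it only illustrates the source, target, inverse and composition rules of a gauge groupoid.

```python
from itertools import product

# Hypothetical toy data: trivial principal bundle P = Q x H over Q = {0, 1, 2}
# with structure group H = Z_2 (written additively), acting on the right.
Q = (0, 1, 2)
H = (0, 1)
P = tuple(product(Q, H))


def act(p, h):
    """Right action of h in H on p = (x, a) in P."""
    x, a = p
    return (x, (a + h) % 2)


def proj(p):
    """Bundle projection pi: P -> Q."""
    return p[0]


def cls(p, q):
    """Class [p, q] in P x_H P, stored as the frozenset of its diagonal H-orbit."""
    return frozenset((act(p, h), act(q, h)) for h in H)


def source(arrow):
    """s([p, q]) = pi(q); independent of the representative chosen."""
    _, q = next(iter(arrow))
    return proj(q)


def target(arrow):
    """t([p, q]) = pi(p); independent of the representative chosen."""
    p, _ = next(iter(arrow))
    return proj(p)


def inverse(arrow):
    """[p, q]^{-1} = [q, p]."""
    p, q = next(iter(arrow))
    return cls(q, p)


def compose(left, right):
    """[p, q][q', r] for pi(q) = pi(q'): pick the representative of `right`
    whose first entry equals q, then return [p, r] for that representative."""
    if source(left) != target(right):
        raise ValueError("arrows are not composable")
    p, q = next(iter(left))
    for q2, r in right:
        if q2 == q:
            return cls(p, r)
    raise AssertionError("unreachable when H acts freely and transitively on fibres")
```

For example, `compose(cls((0, 0), (1, 1)), cls((1, 0), (2, 0)))` first rewrites the second class as `[(1, 1), (2, 1)]` and then returns `cls((0, 0), (2, 1))`; the result does not depend on which representative of the first class is picked.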


Recent philosophical literature on indistinguishable particles includes French & 
Krause (2006), Earman (2010), Caulton & Butterfield (2012), Saunders (2013), and 
Baker, Halvorson, & Swanson (2015). This philosophical literature still needs to
be integrated with the mathematical approach launched in this section, and it was 
indeed the goal of Earman, Halvorson, & Landsman (2013ish) to do so. Alas! 


Chapter 8 
Limits: large N 


Beside the limit $\hbar \to 0$, we consider the limit $N \to \infty$, where $N$ could be the principal
quantum number labeling orbits in atomic physics (as in Bohr’s Correspondence
Principle), or the number of particles or lattice sites, or the number of identical
experiments in a long run measuring the relative frequencies of possible outcomes.

The case of large quantum numbers will be dealt with first: as our toy model
of a classical orbit we take a coadjoint orbit in the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$
of a compact connected Lie group $G$, see §5.9; for $G = SU(2)$ or $SO(3)$ these are
simply two-spheres $S^2$. The corresponding quantum theories are indexed by their
spin $j = \tfrac{1}{2}n$, where $n \in \mathbb{N}$, which we send to infinity in order to recover the classical
orbit. This can be done more generally by rescaling the highest weight $\lambda$ of some
fixed irreducible representation of $G$ to $n\lambda$ and again letting $n \to \infty$.

The second case, where the limit $N \to \infty$ is typically the thermodynamic limit
(namely if the density $N/V$ is kept fixed, where $V$ is the volume of the system sent to
infinity, too), has been rigorously studied using operator algebras since the 1960s. In
such work the system constructed at the limit $N = \infty$ is typically quantum statistical
mechanics in infinite volume, whose existence (followed by the establishment of
e.g. phase transitions) was a major achievement of mathematical physics.

However, our goal in taking the limit $N \to \infty$ is quite different, in that (in the
spirit of Bohrification) our limiting system will be classical; from the traditional
point of view we look at the macroscopic rather than the quasi-local observables.
Nonetheless, for each finite value of $N \in \mathbb{N}$ our (quantum) system will be the same
as in the usual theory! Like the first case, in which increasingly large matrix
algebras converge to an algebra of continuous functions on some compact space, this
apparent miracle is described by the theory of continuous bundles of C*-algebras,
as outlined in §C.19. As in the case $\hbar \to 0$ studied in the previous chapter, this theory
provides a convenient mathematical machinery for studying the limit $N \to \infty$ also.

We then apply the limit $N \to \infty$ to $N$ repeated experiments, and, applying the
doctrine of classical concepts, rederive the Born rule (avoiding the conceptual and
mathematical pitfalls of various previous attempts to do so).

Bridging the gap to the next two chapters, we close with an introduction to quan- 
tum spin systems (as a later playing ground for spontaneous symmetry breaking). 
8.1 Large quantum numbers


As in §5.9, let $G$ be a compact connected Lie group with Lie algebra $\mathfrak{g}$ and dual
$\mathfrak{g}^*$, and let $T \subset G$ be a maximal torus with Lie algebra $\mathfrak{t}$ and dual $\mathfrak{t}^*$. Let $\mathcal{O}_\lambda$ be a
regular integral coadjoint orbit in $\mathfrak{g}^*$, labeled by a dominant weight $\lambda \in \Lambda_d$. This
means that there is a point $\theta \in \mathcal{O}_\lambda$ whose stabilizer $G_\theta$ is $T$, and $\lambda = \theta|_{\mathfrak{t}}$; conversely,
$\lambda \in \mathfrak{t}^*$ determines $\theta \in \mathfrak{g}^*$, which vanishes on each generator $E_\alpha$ of $\mathfrak{g}_{\mathbb{C}}$ ($\alpha \in \Delta$).
Following Theorems 5.49 and 5.51, we associate a unitary irreducible
representation $u_\lambda: G \to U(H_\lambda)$ to $\mathcal{O}_\lambda$ (or rather to $\lambda$), whose underlying Hilbert space $H_\lambda$
contains a unique highest weight vector $v_\lambda$. We then have (5.228). We abbreviate


$$d_\lambda = \dim(H_\lambda). \tag{8.1}$$


For $SU(2)$ we have $\lambda \in \mathbb{N}_0/2 = \{0, \tfrac{1}{2}, 1, \ldots\}$, usually called $j$, and the (regular)
coadjoint orbits in $\mathfrak{g}^* \cong \mathbb{R}^3$ are the spheres $S^2_j$ with radius $j$ (with $j \neq 0$). The
corresponding highest weight representation $u_j$ is carried by $H_j$ with $d_j = 2j + 1$, whose
highest weight vector $v_j$ is an eigenvector of $L_3 = iu_j'(S_3)$ with eigenvalue $j$.

We are going to define a continuous bundle of C*-algebras over the base space


$$I = (1/\mathbb{N}) \cup \{0\} \equiv 1/\dot{\mathbb{N}}, \tag{8.2}$$


where $\mathbb{N} = \{1, 2, \ldots\}$ and $\dot{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$; as required, $I$ contains 0 as an accumulation
point. One may think of elements of $I$ as “quantized” values of Planck’s constant
$\hbar = 1/N$, upon which the limit $N \to \infty$ is formally the same as the limit $\hbar \to 0$.

If $\lambda \in \Lambda_d$, then $n\lambda \in \Lambda_d$ for all $n \in \mathbb{N}$. We may therefore define the C*-algebras


$$A_0 = C(\mathcal{O}_\lambda); \tag{8.3}$$
$$A_{1/n} = B(H_{n\lambda}). \tag{8.4}$$


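For each $f \in C(\mathcal{O}_\lambda)$ we define $\tilde{f} = \pi^*f$ under the canonical projection $\pi: G \to
G/G_\theta \cong \mathcal{O}_\lambda$ (i.e., $\tilde{f}(x) = f(\pi(x))$), which enables us to define the operators


$$Q_{1/n}(f) = d_{n\lambda} \int_G dx\, \tilde{f}(x)\, |u_{n\lambda}(x)v_{n\lambda}\rangle \langle u_{n\lambda}(x)v_{n\lambda}|. \tag{8.5}$$
In fact, the entire integrand in (8.5) is a function on $\mathcal{O}_\lambda$, because for $z \in T$ we have


$$u_{n\lambda}(xz)v_{n\lambda} = u_{n\lambda}(x)u_{n\lambda}(z)v_{n\lambda} = \chi_{n\lambda}(z)\, u_{n\lambda}(x)v_{n\lambda},$$


and $\chi_{n\lambda}(z) \in \mathbb{T}$ cancels the factor $\overline{\chi_{n\lambda}(z)}$ from the last term in (8.5). Note that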


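$$Q_{1/n}(1_{\mathcal{O}_\lambda}) = 1_{H_{n\lambda}}, \tag{8.6}$$


as follows by taking $\psi_2 = \psi_3 = v_{n\lambda}$ in Schur’s well-known orthogonality relations


$$d_{n\lambda} \int_G dx\, \langle \psi_1, u_{n\lambda}(x)\psi_2 \rangle \langle u_{n\lambda}(x)\psi_3, \psi_4 \rangle = \langle \psi_1, \psi_4 \rangle \langle \psi_3, \psi_2 \rangle \quad (\psi_i \in H_{n\lambda}). \tag{8.7}$$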
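
For $SU(2)$ the normalization of the quantization maps on the constant function 1 can be checked exactly. The sketch below rests on assumptions not spelled out in the text: the standard spin-$j$ coherent states, whose squared overlap with the weight vector number $k$ (with $k = j + m$) is $\binom{2j}{k} t^k (1-t)^{2j-k}$ in the variable $t = \cos^2(\theta/2)$, and the fact that $t$ is uniformly distributed on $[0,1]$ under the normalized area measure of the sphere. Off-diagonal entries vanish after integrating over the azimuthal angle and are not computed; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact integral of t**a * (1 - t)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def q_of_one_diagonal(two_j):
    """Diagonal entries, in the weight basis, of (2j + 1) times the sphere
    average of the spin-j coherent-state projections, with two_j = 2j."""
    dim = two_j + 1
    return [
        dim * comb(two_j, k) * beta_integral(k, two_j - k)
        for k in range(two_j + 1)
    ]
```

Every entry of `q_of_one_diagonal(two_j)` equals 1, in line with the statement that the constant function 1 is mapped to the unit operator.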
Other properties of the maps $Q_{1/n}: C(\mathcal{O}_\lambda) \to B(H_{n\lambda})$ (between C*-algebras) are:


• Self-adjointness, i.e., $Q_{1/n}(f)^* = Q_{1/n}(f^*)$.
• Positivity, i.e., $Q_{1/n}(f) \geq 0$ whenever $f \geq 0$.


• Equivariance, i.e., writing $L_y f(x) = f(y^{-1}x)$ as usual, for any $y \in G$ we have


$$Q_{1/n}(L_y f) = u_{n\lambda}(y)\, Q_{1/n}(f)\, u_{n\lambda}(y)^*. \tag{8.8}$$


Positivity does not follow from self-adjointness, as $Q_{1/n}$ is not a homomorphism.


Theorem 8.1. There exists a continuous bundle of C*-algebras $A$ over $I$ as defined
in (8.2), with fibers (8.3) - (8.4), whose continuous sections are given by all
sequences $(a_{1/n})_{n \in \dot{\mathbb{N}}} \in \prod_{n \in \dot{\mathbb{N}}} A_{1/n}$ for which $a_0 \in C(\mathcal{O}_\lambda)$ and $a_{1/n} \in B(H_{n\lambda})$, and the
sequence $(a_{1/n})_{n \in \mathbb{N}}$ is asymptotically equivalent to $(Q_{1/n}(a_0))_{n \in \mathbb{N}}$, in the sense that


$$\lim_{n \to \infty} \|a_{1/n} - Q_{1/n}(a_0)\| = 0. \tag{8.9}$$

In particular, if $f \in C(\mathcal{O}_\lambda)$, then the cross-section of $\prod_{n \in \dot{\mathbb{N}}} A_{1/n}$ defined by
$$a_0 = f; \tag{8.10}$$
$$a_{1/n} = Q_{1/n}(f), \tag{8.11}$$


is continuous. In fact, we have a deformation quantization of $\mathcal{O}_\lambda$ in the sense of
Definition 7.1, where the Poisson structure of $\mathcal{O}_\lambda$ is inherited from (minus) the canonical
one on the Poisson manifold $\mathfrak{g}^*$, but we shall merely prove the claim of the theorem.
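
Continuity of such a section at 0 requires in particular that the operator norms of the quantized function converge to the supremum norm of the function. For $SU(2)$ and the height function $f = \cos\theta$ on the sphere this can be checked exactly. The sketch below makes the same assumptions as any standard spin coherent-state computation (overlaps $\binom{2j}{k} t^k (1-t)^{2j-k}$ with $t = \cos^2(\theta/2)$ uniform on $[0,1]$, and $\cos\theta = 2t - 1$), and uses that the quantization of an azimuthally symmetric function is diagonal in the weight basis; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact integral of t**a * (1 - t)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def q_cos_diagonal(two_j):
    """Diagonal of the quantization of f = cos(theta) at spin j = two_j / 2."""
    dim = two_j + 1
    diagonal = []
    for k in range(two_j + 1):
        # integral of (2t - 1) * t**k * (1 - t)**(two_j - k) over [0, 1]
        integral = 2 * beta_integral(k + 1, two_j - k) - beta_integral(k, two_j - k)
        diagonal.append(dim * comb(two_j, k) * integral)
    return diagonal


def norm_q_cos(two_j):
    """Operator norm of the (diagonal) quantization of cos(theta)."""
    return max(abs(entry) for entry in q_cos_diagonal(two_j))
```

The diagonal entries come out as $m/(j+1)$ with $m = k - j$, so `norm_q_cos(two_j)` equals $j/(j+1)$, which tends to the supremum norm 1 of $\cos\theta$ as $j \to \infty$. The largest entry is also the expectation value in the highest weight vector, which tends to the value of $f$ at the north pole.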


Proof. This will follow from Proposition C.124, in whose notation $\tilde{A}$ (which will
actually coincide with $A$) consists of all $a = (a_\hbar)_{\hbar \in I}$ where $f$ runs through $C(\mathcal{O}_\lambda)$ in


$$a_0 = f; \tag{8.12}$$
$$a_{1/n} = Q_{1/n}(f). \tag{8.13}$$


To verify the conditions for Proposition C.124 we start with the property that the set
$\{a_\hbar \mid a \in \tilde{A}\}$ be dense in $A_\hbar$; we will show that it even coincides with $A_\hbar$. At $\hbar = 0$
this is true by construction. At $\hbar = 1/n$, the required property


$$Q_{1/n}(C(\mathcal{O}_\lambda)) = B(H_{n\lambda}) \tag{8.14}$$


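can be proved in two steps. For simplicity we set $n = 1$; the proof is the same for
any $n \in \mathbb{N}$. The first step is to define a function $L_a$ on $G$ for each $a \in B(H_\lambda)$ by


$$L_a(x) = \mathrm{Tr}\,\bigl(a\, |u_\lambda(x)v_\lambda\rangle \langle u_\lambda(x)v_\lambda|\bigr) = \langle v_\lambda, u_\lambda(x)^* a\, u_\lambda(x) v_\lambda \rangle. \tag{8.15}$$


This function is continuous and is right-invariant under $T$, so that $L_a$ is really an
element of $C(\mathcal{O}_\lambda)$. Thus we have a map $L: B(H_\lambda) \to C(\mathcal{O}_\lambda)$, $a \mapsto L_a$. Furthermore,


$$\langle a, Q_1(f) \rangle_{HS} = \langle L_a, f \rangle_{L^2}, \tag{8.16}$$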
where the Hilbert-Schmidt inner product on the left-hand side is $\langle a,b\rangle_{HS} = \mathrm{Tr}\,(a^*b)$, cf. (B.495) (which is well defined since $H_\lambda$ is finite-dimensional), and the right-hand side is the inner product on $L^2(\mathcal O_\lambda)$ with respect to the measure induced by the subspace of $L^2(G, d_\lambda\cdot dx)$ consisting of T-invariant functions. Now $Q_{1/n}(C(\mathcal O_\lambda))$ is a (necessarily closed) linear subspace of $B(H_{n\lambda})$, which coincides with $B(H_{n\lambda})$ iff its orthogonal complement in the Hilbert-Schmidt inner product is zero.

Hence (8.14) is equivalent to the implication: $a \in (Q_{1/n}(C(\mathcal O_\lambda)))^\perp \Rightarrow a = 0$. By (8.16), the antecedent holds iff $\langle L_a, f\rangle_2 = 0$ for each $f \in C(\mathcal O_\lambda)$, which, because $C(\mathcal O_\lambda)$ is dense in $L^2(\mathcal O_\lambda)$, holds iff $L_a = 0$. Hence the above implication is equivalent to: $L_a = 0 \Rightarrow a = 0$, i.e., $\ker L = \{0\}$. We must therefore prove the latter.

If $L_a(x) = 0$ for all $x \in G$, then, taking $x = \exp(t_1A_1)\cdots\exp(t_nA_n)$, where each $A_i \in \mathfrak g$, and applying (5.156) for each $t_i$ to the right-hand side of (8.15), we obtain
$$\langle v_\lambda, [u'_\lambda(A_n), \cdots [u'_\lambda(A_2), [u'_\lambda(A_1), a]] \cdots ]\, v_\lambda\rangle = 0. \tag{8.17}$$


This equality extends to $\mathfrak g_{\mathbb C}$, so we may take $A_i = E_{\alpha_i}$ for some positive root $\alpha_i \in \Delta^+$. Since $u'_\lambda(E_\alpha)v_\lambda = 0$ for $\alpha \in \Delta^+$, of each commutator $[u'_\lambda(E_{\alpha_i}), a]$ only the term $u'_\lambda(E_{\alpha_i})a$ contributes. Moving the $u'_\lambda(E_{\alpha_i})$ to act as $u'_\lambda(E_{\alpha_i})^* = u'_\lambda(E_{-\alpha_i})$ on the vector on the left in the inner product in (8.17) gives all other eigenvectors of $\mathfrak t$, so that (8.17) implies $\langle\psi, a v_\lambda\rangle = 0$ for each $\psi \in H_\lambda$, and hence $a v_\lambda = 0$. Now it is clear from (8.15) that $L_{u_\lambda(y)^* a u_\lambda(y)}(x) = L_a(yx)$, so if $L_a(x) = 0$ for all $x \in G$, then also $L_{u_\lambda(y)^* a u_\lambda(y)}(x) = 0$ for all $x \in G$. Hence we may replace a by $u_\lambda(y)^* a u_\lambda(y)$ in the above argument, finding $u_\lambda(y)^* a u_\lambda(y) v_\lambda = 0$ and hence $a u_\lambda(y) v_\lambda = 0$ for each $y \in G$. Since $u_\lambda$ is irreducible, this implies $a\psi = 0$ for any $\psi \in H_\lambda$, and hence a = 0. This completes the proof of (8.14). Proposition C.124 furthermore requires


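$$\lim_{n\to\infty} \|Q_{1/n}(f)\| = \|f\|_\infty. \tag{8.18}$$
This follows from the following key property (to be proved at the end):
$$\lim_{n\to\infty} \langle u_{n\lambda}(y)v_{n\lambda}, Q_{1/n}(f)\, u_{n\lambda}(y)v_{n\lambda}\rangle = f_\lambda(y), \tag{8.19}$$
for any $y \in G$ and $f \in C(\mathcal O_\lambda)$. Indeed, for any $y \in G$ we obviously have
$$\|Q_{1/n}(f)\| \geq |\langle u_{n\lambda}(y)v_{n\lambda}, Q_{1/n}(f)\, u_{n\lambda}(y)v_{n\lambda}\rangle|. \tag{8.20}$$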


Since G and hence $\mathcal O_\lambda$ is compact, by Weierstrass's Theorem there is a $y \in G$ such that $|f_\lambda(y)| = \|f\|_\infty$. Using this y in (8.20) and (8.19), the two of these imply
$$\liminf_{n\to\infty} \|Q_{1/n}(f)\| \geq \|f\|_\infty. \tag{8.21}$$


Conversely, for any unit vector $\psi \in H_{n\lambda}$, eqs. (8.5) and (8.7) imply


(VQin(P)Y) = KW Qinl(AW) S lf lle- (8.22) 
If f is real-valued, then $Q_{1/n}(f)^* = Q_{1/n}(f^*) = Q_{1/n}(f)$. In that case, (8.22) implies
$$\|Q_{1/n}(f)\| \leq \|f\|_\infty. \tag{8.23}$$
By the C*-identity $\|a^*a\| = \|a\|^2$, this is true for any $f \in C(\mathcal O_\lambda)$. Therefore,
$$\limsup_{n\to\infty} \|Q_{1/n}(f)\| \leq \|f\|_\infty. \tag{8.24}$$
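Eqs. (8.21) and (8.24) yield (8.18). It remains to prove (8.19), i.e.,
$$\lim_{n\to\infty} d_{n\lambda} \int_G dx\, f_\lambda(x)\, |\langle u_{n\lambda}(y)v_{n\lambda}, u_{n\lambda}(x)v_{n\lambda}\rangle|^2 = f_\lambda(y). \tag{8.25}$$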


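The key to the proof is the fact that if $\lambda$ and $\mu$ are dominant weights, with associated highest weight representations $u_\lambda$ and $u_\mu$, respectively, for any $x \in G$ one has
$$\langle v_\lambda, u_\lambda(x)v_\lambda\rangle \cdot \langle v_\mu, u_\mu(x)v_\mu\rangle = \langle v_{\lambda+\mu}, u_{\lambda+\mu}(x)v_{\lambda+\mu}\rangle. \tag{8.26}$$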


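Namely, because the exponential map is surjective for compact connected Lie groups, eq. (8.26) is equivalent to the property
$$\langle v_\lambda, u'_\lambda(A)v_\lambda\rangle + \langle v_\mu, u'_\mu(A)v_\mu\rangle = \langle v_{\lambda+\mu}, u'_{\lambda+\mu}(A)v_{\lambda+\mu}\rangle, \tag{8.27}$$
for any $A \in \mathfrak g$. For $A \in \mathfrak t$ this amounts to $\lambda + \mu = \lambda + \mu$, cf. (5.228), whereas for $A = E_\alpha$ for some root $\alpha \in \Delta$ we have 0 = 0, so that (8.27) is true for all $A \in \mathfrak g$. This also proves (8.26), of which we need the special (and iterated) case
$$\langle v_{n\lambda}, u_{n\lambda}(x)v_{n\lambda}\rangle = \langle v_\lambda, u_\lambda(x)v_\lambda\rangle^n. \tag{8.28}$$
This motivates us to introduce a sequence $(\mu_n)$ of probability measures on G by
$$d\mu_n(x) = d_{n\lambda}\, dx\, |\langle v_\lambda, u_\lambda(x)v_\lambda\rangle|^{2n}, \tag{8.29}$$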


so that, after a change $x \mapsto yx$ of the integration variable, eq. (8.25) reads
$$\lim_{n\to\infty} \int_G d\mu_n(x)\, f_\lambda(yx) = f_\lambda(y) \tag{8.30}$$


for any $f \in C(\mathcal O_\lambda)$. Now $F(x) = |\langle v_\lambda, u_\lambda(x)v_\lambda\rangle|$ takes values in (0, 1] and hence the measure (8.29) is $d\mu_n(x) \sim \exp(-nS(x))$ for $S(x) = -\ln(F(x))$, with $S \geq 0$ and $S(x) = 0$ iff $x \in G_\lambda = T$ (using regularity of the orbit). In that case, i.e., if $z \in T$,
then fy (yz) = f(a(yz)) = f(a(y)) = fa (y). The method of steepest descent shows 
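that any part of G (of positive Haar measure) where S(x) > 0 makes no contribution as $n \to \infty$, so that we may replace $f_\lambda(yx)$ in (8.30) by $f_\lambda(y)$, obtaining
$$\lim_{n\to\infty} \int_G d\mu_n(x)\, f_\lambda(yx) = f_\lambda(y) \lim_{n\to\infty} \int_G d\mu_n(x) = f_\lambda(y) \lim_{n\to\infty} 1 = f_\lambda(y). \tag{8.31}$$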
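The concentration of the measures (8.29) can be made explicit in one special case. The following sketch assumes G = SU(2) with the spin-j highest weight representation (an assumption made only for illustration; the proof covers general compact G). In that case the squared overlap in (8.29) is cos^{4j}(θ/2), where θ is the polar angle on the sphere SU(2)/T, the dimension is 2j+1, and the mass of the region θ ≥ θ0 has the closed form cos^{4j+2}(θ0/2), which decays exponentially in j. The function names are hypothetical.

```python
import math

def overlap_squared(j: float, theta: float) -> float:
    """|<v, u(x) v>|^2 for the spin-j highest weight vector of SU(2);
    theta is the polar angle of the rotated point on the sphere."""
    return math.cos(theta / 2) ** (4 * j)

def mass_outside_cone(j: float, theta0: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the mass that the spin-j analogue of (8.29)
    assigns to {theta >= theta0}, using the normalized area element
    (1/2) sin(theta) dtheta on the sphere."""
    h = (math.pi - theta0) / steps
    total = 0.0
    for k in range(steps):
        theta = theta0 + (k + 0.5) * h
        density = (2 * j + 1) * overlap_squared(j, theta)
        total += density * 0.5 * math.sin(theta) * h
    return total

def closed_form(j: float, theta0: float) -> float:
    """Exact value of the same mass: cos(theta0/2)^(4j+2)."""
    return math.cos(theta0 / 2) ** (4 * j + 2)
```

For fixed θ0 > 0 the mass outside the cone tends to zero as j grows, which is the SU(2) version of the statement that regions with S(x) > 0 do not contribute in the limit.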


We have now verified conditions 1 and 2 in Proposition C.124, and no. 3 is trivially satisfied since in condition 1 we have equality with $A_\hbar$, as shown above.
8.2 Large systems


We now move from large quantum numbers within a single system to large quantum systems that consist of N identical sites, where we eventually study what happens as $N \to \infty$ (as is customary in quantum statistical mechanics we change notation from $n \in \mathbb N$ to $N \in \mathbb N$). This limit gives rise to two different continuous bundles $A^{(c)}$ and $A^{(q)}$ of C*-algebras over I as given by (8.2), which have exactly the same fibers at 1/N but, amazingly, differ dramatically at $N = \infty$, i.e., 1/N = 0. This difference reflects two choices one may make for the N-particle observables that have a limit as $N \to \infty$, namely local ones, giving rise to a highly non-commutative limit algebra $A_0^{(q)}$ (which is the one usually studied in quantum statistical mechanics of infinite systems), and macroscopic ones, which generate a commutative algebra $A_0^{(c)}$ of observables of an infinite quantum system (describing classical thermodynamics as a limit of quantum statistical mechanics). It is the latter that we need for Bohrification.

Let B be a fixed unital C*-algebra, describing a single quantum system. The case of a two-level system, where $B = M_2(\mathbb C)$, is already fascinating, and many other interesting examples are described by finite-dimensional C*-algebras. Though irrelevant in finite dimension, we note that the constructions below are generally valid if (for technical reasons to be found in Proposition C.97) we use the projective tensor product $\hat\otimes_{\max}$ between C*-algebras; see §C.13. For any $N \in \mathbb N$ we put
$$A^{(c)}_{1/N} = A^{(q)}_{1/N} = B^N, \tag{8.32}$$
i.e., the N-fold (projective) tensor product $\hat\otimes^N_{\max} B$ of B with itself. Furthermore,
$$A^{(c)}_0 = C(S(B)); \tag{8.33}$$
$$A^{(q)}_0 = B^\infty, \tag{8.34}$$

where S(B) is the state space of B, seen as a compact convex set in the weak*-topology, as usual, and $B^\infty$ is the infinite (projective) tensor product of B with itself as described in §C.14; see especially (C.318) with $C_i = B$ for each i. For example, the state space of $B = M_2(\mathbb C)$ is affinely homeomorphic to the unit ball in $\mathbb R^3$, whose boundary is the familiar Bloch sphere of qubits; see Proposition 2.9.

We now explain how (8.32) and (8.33) - (8.34) give rise to continuous bundles $A^{(c)}$ and $A^{(q)}$ of C*-algebras, starting with the former. First, for each $N \in \mathbb N$, let $\mathfrak S_N$ be the permutation group (i.e. symmetric group) on N objects, acting on $B^N$ in the obvious way, i.e., by linear and continuous extension of
$$\alpha^{(N)}_p(b_1 \otimes \cdots \otimes b_N) = b_{p(1)} \otimes \cdots \otimes b_{p(N)}, \tag{8.35}$$
where $b_i \in B$. This yields a symmetrization operator $S_N : B^N \to B^N$ defined by
$$S_N = \frac{1}{N!} \sum_{p \in \mathfrak S_N} \alpha^{(N)}_p. \tag{8.36}$$
If B is infinite-dimensional, these maps can be extended by continuity to the completion $B^N = \hat\otimes^N_{\max} B$ of the algebraic tensor product $\otimes^N B$; indeed, passing to any faithful representation of B it is easy to see that $S_N$ is even continuous with respect to the minimal cross-norm (cf. §C.13). For $N \geq M$ we then define
$$S_{M,N} : B^M \to B^N \tag{8.37}$$
by linear (and if necessary continuous) extension of
$$S_{M,N}(a_{1/M}) = S_N(a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B) \quad (a_{1/M} \in B^M), \tag{8.38}$$


with N − M copies of the unit $1_B \in B$ so as to obtain an element of $B^N$. Clearly, $S_{N,N} = S_N$. In particular, $S_{1,N} : B \to B^N$ gives the average of b over N copies of B:
$$S_{1,N}(b) = \frac{1}{N} \sum_{k=1}^{N} 1_B \otimes \cdots \otimes b_{(k)} \otimes 1_B \cdots \otimes 1_B. \tag{8.39}$$


For example, take $B = M_n(\mathbb C)$ for simplicity, and pick some $a = a^* \in B$ and $\lambda \in \sigma(a)$, with associated spectral projection $e_\lambda$. Putting $b = e_\lambda$ in (8.39) gives
$$f^\lambda_N = S_{1,N}(e_\lambda). \tag{8.40}$$
This is a frequency operator: applied to states of the kind $v_1 \otimes \cdots \otimes v_N \in (\mathbb C^n)^N$, where each $v_i$ is an eigenstate of a, so that $a v_i = \lambda_i v_i$ for some $\lambda_i \in \sigma(a)$, the corresponding operator counts the relative frequency of $\lambda$ in the list $(\lambda_1, \ldots, \lambda_N)$.
The commutative case B = C(X) provides a classical analogue. Eq. (C.271) gives
$$B^N = C(X)^N \cong C(X^N), \tag{8.41}$$
so that, identifying elements of $B^N$ with functions on $X^N$, for $f \in C(X)$ we have
$$S_{1,N}(f)(x_1, \ldots, x_N) = \frac{1}{N}\,(f(x_1) + \cdots + f(x_N)). \tag{8.42}$$
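The averaging map (8.39) and the frequency operator (8.40) can be checked numerically in the smallest case. The sketch below assumes $B = M_2(\mathbb C)$, the observable a = diag(+1, −1), and N = 3; the function name `site_average` is hypothetical, and NumPy's Kronecker product stands in for the tensor product.

```python
from functools import reduce

import numpy as np

def site_average(b: np.ndarray, N: int) -> np.ndarray:
    """S_{1,N}(b): the average over k of b placed in slot k, identities elsewhere."""
    d = b.shape[0]
    identity = np.eye(d)
    total = np.zeros((d ** N, d ** N), dtype=complex)
    for k in range(N):
        factors = [identity] * N
        factors[k] = b
        total += reduce(np.kron, factors)
    return total / N

# Spectral projection of a = diag(+1, -1) onto the eigenvalue +1.
e_plus = np.diag([1.0, 0.0])
basis = np.eye(2)

# Product eigenstate with eigenvalue list (+1, -1, +1).
state = reduce(np.kron, [basis[0], basis[1], basis[0]])

freq_op = site_average(e_plus, 3)
frequency = np.vdot(state, freq_op @ state).real
```

Here `state` is an eigenvector of `freq_op`, and the eigenvalue is the fraction of sites carrying the eigenvalue +1.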


We return to the construction of a continuous bundle of C*-algebras with fibers 
(8.32) and (8.33). As in §8.1, we construct this bundle by specifying a preliminary 
family of continuous cross-sections and then using Proposition C.124 to finish. 


Definition 8.2. We say that a sequence $(a_{1/N})_{N\in\mathbb N}$, with $a_{1/N} \in B^N$, is symmetric when there exist $M \in \mathbb N$ and $a_{1/M} \in B^M$ such that for each $N \geq M$ one has
$$a_{1/N} = S_{M,N}(a_{1/M}). \tag{8.43}$$

This implies $a_{1/M} = S_M(a_{1/M})$. Symmetric sequences can start in any finite way they like, but their infinite tails consist of averaged observables. Hence symmetric sequences asymptotically commute: if $(a_{1/N})$ and $(b_{1/N})$ are symmetric, then
$$\lim_{N\to\infty} \|a_{1/N} b_{1/N} - b_{1/N} a_{1/N}\| = 0, \tag{8.44}$$


simply because the commutators of single-site operators are nonvanishing only at finitely many positions, upon which the factor 1/N in (8.39) guarantees (8.44).
For example, if $B = M_2(\mathbb C)$, and $(\sigma_i)$ are the Pauli matrices, we have
$$[S_{1,N}(\tfrac12\hbar\sigma_1), S_{1,N}(\tfrac12\hbar\sigma_2)] = i\frac{\hbar}{N}\, S_{1,N}(\tfrac12\hbar\sigma_3), \tag{8.45}$$
et cetera, showing that the averaged spin-½ operators effectively rescale $\hbar$ to $\hbar/N$.
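The commutation relation (8.45) can be verified directly for small N. This is a minimal numerical sketch, assuming $\hbar = 1$ and using a hypothetical helper `site_average` for $S_{1,N}$; it also records the operator norm of the commutator, which equals $\hbar^2/(2N)$ and therefore vanishes as N grows, in line with (8.44).

```python
from functools import reduce

import numpy as np

SIGMA_1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_3 = np.array([[1, 0], [0, -1]], dtype=complex)

def site_average(b: np.ndarray, N: int) -> np.ndarray:
    """S_{1,N}(b): the average over k of b placed in slot k, identities elsewhere."""
    d = b.shape[0]
    identity = np.eye(d)
    total = np.zeros((d ** N, d ** N), dtype=complex)
    for k in range(N):
        factors = [identity] * N
        factors[k] = b
        total += reduce(np.kron, factors)
    return total / N

def commutator_check(N: int, hbar: float = 1.0):
    """Return (does (8.45) hold numerically?, operator norm of the commutator)."""
    s1, s2, s3 = (site_average(0.5 * hbar * s, N) for s in (SIGMA_1, SIGMA_2, SIGMA_3))
    lhs = s1 @ s2 - s2 @ s1
    rhs = 1j * (hbar / N) * s3
    return bool(np.allclose(lhs, rhs)), float(np.linalg.norm(lhs, 2))
```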
In view of this, it is reasonable to expect that we may be able to assemble the algebras $B^N$ into a continuous bundle whose limit algebra at $N = \infty$ is commutative.
For each symmetric sequence $(a_{1/N})$ we define a function $a_0 : S(B) \to \mathbb C$ by
$$a_0(\omega) = \lim_{N\to\infty} \omega^N(a_{1/N}), \tag{8.46}$$
where $\omega \in S(B)$, and $\omega^N \in S(B^N)$ is defined by linear (and continuous) extension of
$$\omega^N(b_1 \otimes \cdots \otimes b_N) = \omega(b_1) \cdots \omega(b_N); \tag{8.47}$$

continuity of $\omega^N$ on the algebraic tensor product $\otimes^N B$ (and hence extendibility to $A_{1/N}$) is guaranteed by Proposition C.98, although this is not really needed here because $a_0$ only requires the values of $\omega^N$ on $\otimes^N B$ itself. In any case, the limit exists by definition of a symmetric sequence, from which we also see that $a_0 \in C(S(B))$, because it is a finite sum of finite products of the type $\omega(b_1) \cdots \omega(b_M)$, each of which is continuous in $\omega$ by definition of the w*-topology on S(B).
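The reason the limit (8.46) exists is that product states are permutation-invariant, so the value $\omega^N(S_{M,N}(a_{1/M}))$ does not depend on N once $N \geq M$. The sketch below illustrates this for a qubit, assuming the state is given by a density matrix (so that $\omega(b) = \mathrm{Tr}(\rho b)$) and realizing the action (8.35) by permuting tensor axes. All function names are hypothetical.

```python
from functools import reduce
from itertools import permutations
from math import factorial

import numpy as np

def permute_sites(a: np.ndarray, p, d: int) -> np.ndarray:
    """Apply the site permutation p (0-based tuple) to an operator on (C^d)^N."""
    N = len(p)
    tensor = a.reshape((d,) * (2 * N))
    axes = list(p) + [N + q for q in p]  # permute row and column indices alike
    return tensor.transpose(axes).reshape(d ** N, d ** N)

def symmetrize(a: np.ndarray, N: int, d: int) -> np.ndarray:
    """S_N: average over all N! site permutations, cf. (8.36)."""
    return sum(permute_sites(a, p, d) for p in permutations(range(N))) / factorial(N)

def sym_embed(a_M: np.ndarray, M: int, N: int, d: int) -> np.ndarray:
    """S_{M,N}: pad with N - M identities, then symmetrize, cf. (8.38)."""
    padded = reduce(np.kron, [a_M] + [np.eye(d)] * (N - M))
    return symmetrize(padded, N, d)

def product_state_value(rho: np.ndarray, a: np.ndarray, N: int) -> complex:
    """omega^N(a) for omega = Tr(rho . ), cf. (8.47)."""
    rho_N = reduce(np.kron, [rho] * N)
    return np.trace(rho_N @ a)

# Illustrative data: a qubit density matrix and two Pauli matrices.
rho = np.array([[0.7, 0.2], [0.2, 0.3]])
b1 = np.array([[0, 1], [1, 0]], dtype=complex)
b2 = np.array([[1, 0], [0, -1]], dtype=complex)
values = [product_state_value(rho, sym_embed(np.kron(b1, b2), 2, N, 2), N)
          for N in (2, 3, 4)]
```

Each entry of `values` agrees with $\omega(b_1)\omega(b_2)$, which is the value of the limit function $a_0$ at $\omega$.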
For example, the frequency operators (8.40) define a symmetric sequence $(f^\lambda_N)_{N\in\mathbb N}$, whose limit function $f^\lambda_0 : S(B) \to \mathbb C$ in the sense of (C.560) or (8.46) is
$$f^\lambda_0(\omega) = \omega(e_\lambda). \tag{8.48}$$


Thus (8.46) gives the Born probability for the outcome $a = \lambda$ in the state $\omega$; see §8.4. Classically, identifying elements of S(C(X)) with probability measures $\mu$ on X, the limit of the sequence $a_{1/N} = S_{1,N}(f)$ for fixed $f \in C(X)$, cf. (8.42), is
$$a_0(\mu) = \int_X d\mu\, f. \tag{8.49}$$


This convergence is an example of the strong law of large numbers, see §8.3. 
We return to the general case. 


Definition 8.3. A sequence $(a_{1/N})_{N\in\mathbb N}$ as above is quasi-symmetric if for each $N \in \mathbb N$ one has $a_{1/N} = S_N(a_{1/N})$ and for any $\varepsilon > 0$ there is a symmetric sequence $(\tilde a_{1/N})$ and some $M \in \mathbb N$ such that $\|a_{1/N} - \tilde a_{1/N}\| < \varepsilon$ for all N > M.

For example, if $\lim_{N\to\infty} \|a_{1/N} - \tilde a_{1/N}\| = 0$ for some fixed symmetric sequence $(\tilde a_{1/N})$, then $(a_{1/N})_{N\in\mathbb N}$ is obviously quasi-symmetric.
Theorem 8.4. For any unital C*-algebra B, the C*-algebras (8.32) and (8.33), i.e.,
$$A^{(c)}_0 = C(S(B)); \tag{8.50}$$
$$A^{(c)}_{1/N} = B^N, \tag{8.51}$$
where $B^N$ is the N-fold projective tensor power $\hat\otimes^N_{\max} B$ of B, are the fibers of a continuous bundle $A^{(c)}$ of C*-algebras over $I = (1/\mathbb N) \cup \{0\} \equiv 1/\dot{\mathbb N}$ whose continuous cross-sections are the quasi-symmetric sequences $(a_{1/N})$ with limit $a_0$ given by (8.46).


As in Theorem 8.1, also here we have a deformation quantization of S(B) in the sense of Definition 7.1, where the Poisson bracket on S(B) may be defined by specifying its value on linear functions $\hat b \in C(S(B))$, where $b \in B$ and $\hat b(\omega) = \omega(b)$, by
$$\{\hat a, \hat b\} = i\,\widehat{[a,b]}. \tag{8.52}$$
Unfortunately, this involves the theory of infinite-dimensional Poisson manifolds, which we prefer to omit. Thus we shall only prove Theorem 8.4 as stated.
The proof relies on Størmer's quantum De Finetti Theorem 8.6 below.


Definition 8.5. Let B be a unital C*-algebra. A state $\rho$ on $B^N$ is called:

• permutation-invariant if $\rho \circ \alpha^{(N)}_p = \rho$ for any $p \in \mathfrak S_N$.
• K-exchangeable ($K \in \mathbb N$) if it is permutation-invariant and in addition $\rho$ is the restriction to $B^N$ of some permutation-invariant state on $B^{N+K}$.
• infinitely exchangeable if it is K-exchangeable for all $K \in \mathbb N$.

The set of all permutation-invariant states / K-exchangeable states / infinitely exchangeable states on $B^N$ is denoted by $S^{\mathfrak S_N}(B^N)$ / $S^{\mathfrak S_N}_K(B^N)$ / $S^{\mathfrak S_N}_\infty(B^N)$.


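Theorem 8.6. Let B be a unital C*-algebra. For any $N \in \mathbb N$ the correspondence $\omega^N \leftrightarrow \omega$, where $\omega \in S(B)$ and $\omega^N \in S(B^N)$, cf. (8.47), gives a bijection
$$\partial_e S^{\mathfrak S_N}_\infty(B^N) \cong S(B). \tag{8.53}$$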


This theorem was originally stated (in the language of infinite tensor products) as 
Theorem 8.9 in §8.3, where it (and hence Theorem 8.6) will also be proved. 

We also need a formula for the norm of any self-adjoint element a of any C*-algebra A in terms of the state space S(A) and the pure state space P(A), viz.
$$\|a\| = \sup\{|\omega(a)| : \omega \in S(A)\} = \sup\{|\omega(a)| : \omega \in P(A)\}. \tag{8.54}$$


This follows from Proposition C.15, the spectral radius formula (B.254), and com- 
pactness of o(a), implying that the supremum in (B.254) is reached on o(a). 


Proof. The proof of Theorem 8.4 is quite similar to the proof of Theorem 8.1, in that we once again rely on Proposition C.124, where the symmetric sequences are going to play the role of $\tilde A$. To apply Proposition C.124, we should prove that:
1. The set $\tilde A_0$ (consisting of all $\tilde a_0 \in A_0 = C(S(B))$ as defined by (8.46), where $(\tilde a_{1/N})$ runs through all symmetric sequences) is a *-algebra which is dense in $A_0$.
2. For any symmetric sequence $(\tilde a_{1/N})$ with limit $\tilde a_0$ as given by (8.46), one has
$$\lim_{N\to\infty} \|\tilde a_{1/N}\| = \|\tilde a_0\|. \tag{8.55}$$
To prove the first claim, we first note that $\tilde A_0$ is the linear span of all finite products $\omega(b_1) \cdots \omega(b_N)$, where $N \in \mathbb N$ and $b_1, \ldots, b_N \in B$. Since $\overline{\omega(b)} = \omega(b^*)$ this is obviously a *-algebra. The monomials $\hat b(\omega) = \omega(b)$ already separate points of $S(B) \subset B^*$, since if $\omega' \neq \omega$ then clearly there is some $b \in B$ for which $(\omega - \omega')(b) \neq 0$. Hence claim no. 1 follows from the Stone–Weierstrass Theorem B.51.

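For the second, let $(\tilde a_{1/N})$ be a symmetric sequence. Since there are $M \in \mathbb N$ and $\tilde a_{1/M} \in B^M$ with $\tilde a_{1/M} = S_M(\tilde a_{1/M})$ and $\tilde a_{1/(M+K)} = S_{M,M+K}(\tilde a_{1/M})$ for all $K \in \mathbb N$,
$$\|\tilde a_{1/M}\| = \sup\{|\rho(\tilde a_{1/M})| : \rho \in S(B^M)\} = \sup\{|\rho(\tilde a_{1/M})| : \rho \in S^{\mathfrak S_M}(B^M)\};$$
$$\|\tilde a_{1/(M+K)}\| = \sup\{|\rho(S_{M,M+K}(\tilde a_{1/M}))| : \rho \in S^{\mathfrak S_{M+K}}(B^{M+K})\} = \sup\{|\rho(\tilde a_{1/M})| : \rho \in S^{\mathfrak S_M}_K(B^M)\},$$
where we used (8.54) and (8.43). Theorem 8.6 and (8.46) then yield (8.55):
$$\begin{aligned}
\lim_{N\to\infty} \|\tilde a_{1/N}\| &= \lim_{K\to\infty} \|\tilde a_{1/(M+K)}\| \\
&= \sup\{|\rho(\tilde a_{1/M})| : \rho \in S^{\mathfrak S_M}_\infty(B^M)\} \\
&= \sup\{|\rho(\tilde a_{1/M})| : \rho \in \partial_e S^{\mathfrak S_M}_\infty(B^M)\} = \sup\{|\omega^M(\tilde a_{1/M})| : \omega \in S(B)\} \\
&= \sup\{|\lim_{N\to\infty} \omega^N(\tilde a_{1/N})| : \omega \in S(B)\} = \sup\{|\tilde a_0(\omega)| : \omega \in S(B)\} \\
&= \|\tilde a_0\|_\infty.
\end{aligned}$$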
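The norm convergence (8.55) can be observed numerically on a small example. The sketch below assumes $B = M_2(\mathbb C)$ and the symmetric sequence generated by $\sigma_1 \otimes \sigma_2$, for which the limit function is $a_0(\omega) = \omega(\sigma_1)\omega(\sigma_2)$, with supremum 1/2 over the Bloch ball. Instead of summing over all permutations it uses the equivalent pair formula $S_{2,N}(b_1 \otimes b_2) = \frac{1}{N(N-1)}\sum_{i\neq j} b_1^{(i)} b_2^{(j)}$, which follows from (8.36) and (8.38) by counting permutations. Function names are hypothetical.

```python
from functools import reduce

import numpy as np

SIGMA_1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def one_site(b: np.ndarray, k: int, N: int) -> np.ndarray:
    """b acting on site k (0-based) of N sites, identity on the other sites."""
    factors = [np.eye(b.shape[0])] * N
    factors[k] = b
    return reduce(np.kron, factors)

def pair_average(b1: np.ndarray, b2: np.ndarray, N: int) -> np.ndarray:
    """S_{2,N}(b1 (x) b2), written as an average over ordered pairs i != j."""
    dim = b1.shape[0] ** N
    total = np.zeros((dim, dim), dtype=complex)
    for i in range(N):
        for j in range(N):
            if i != j:
                total += one_site(b1, i, N) @ one_site(b2, j, N)
    return total / (N * (N - 1))

# Operator norms of the symmetric sequence for N = 2, ..., 6.
norms = [float(np.linalg.norm(pair_average(SIGMA_1, SIGMA_2, N), 2))
         for N in range(2, 7)]
```

The norms start at 1 for N = 2, never increase with N, and stay above the limiting value 1/2, as the proof predicts: the supremum is taken over ever smaller sets of K-exchangeable states.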


The proof that the sequences $(a_{1/N})$ for which condition (C.552) in Proposition C.124 holds are precisely the quasi-symmetric sequences is the same as the proof of the equivalence of the two conditions in Lemma C.125, taking $\hbar_0 = 0$.

Finally, it is easy to show that the limit (8.46) exists also for quasi-symmetric observables a: take $\varepsilon > 0$ and find $\tilde a$ and M as in Definition 8.3. For this $\tilde a$, let $M_0$ be such that (8.43) holds (with $M \rightsquigarrow M_0$). For all N, N' greater than both M and $M_0$,
$$\begin{aligned}
|\omega^N(a_{1/N}) - \omega^{N'}(a_{1/N'})| &\leq |\omega^N(a_{1/N} - \tilde a_{1/N}) - \omega^{N'}(a_{1/N'} - \tilde a_{1/N'})| \\
&\quad + |\omega^N(\tilde a_{1/N}) - \omega^{N'}(\tilde a_{1/N'})| \\
&\leq \|a_{1/N} - \tilde a_{1/N}\| + \|a_{1/N'} - \tilde a_{1/N'}\| + 0 \\
&< 2\varepsilon,
\end{aligned} \tag{8.56}$$
since $\|\omega^N\| = 1$. Hence $(\omega^N(a_{1/N}))$ is a Cauchy sequence (in $\mathbb C$).


Our second continuous bundle of C*-algebras of interest is described by the fol- 
lowing changes in Definitions 8.2 and 8.3. 




Definition 8.7. Let B be a unital C*-algebra and let $a_{1/N} \in B^N$ for each $N \in \mathbb N$.

• A sequence $(a_{1/N})_{N\in\mathbb N}$ is called local when there exist $M \in \mathbb N$ and $a_{1/M} \in B^M$ such that for each $N \geq M$ one has
$$a_{1/N} = a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B, \tag{8.57}$$
with N − M copies of the unit $1_B \in B$ (so that indeed $a_{1/N} \in B^N$).
• A sequence $(a_{1/N})_{N\in\mathbb N}$ is quasi-local if for any $\varepsilon > 0$ there is a local sequence $(\tilde a_{1/N})$ and some $M \in \mathbb N$ such that $\|a_{1/N} - \tilde a_{1/N}\| < \varepsilon$ for all N > M.


For the right analogue of Theorem 8.4 we recall the description of the infinite tensor product $B^\infty$; cf. §C.14, especially the explanation preceding (C.315). Accordingly, a dense subspace of $B^\infty$ is given by equivalence classes of local sequences $(a_{1/N})_{N\in\mathbb N}$ under the equivalence relation $a \sim a'$ iff $\lim_{N\to\infty} \|a_{1/N} - a'_{1/N}\| = 0$; the C*-algebraic operations in $B^\infty$ are inherited from the $B^N$, and if we denote the equivalence class of $(a_{1/N})$ by $[a_{1/N}]_N$, the norm in $B^\infty$ is given by
$$\|[a_{1/N}]_N\| = \lim_{N\to\infty} \|a_{1/N}\|. \tag{8.58}$$
By construction, this number is independent of the representative $(a_{1/N})_N$ in the class $[a_{1/N}]_N$. By definition, $B^\infty$ is the completion of the space of these equivalence classes in the norm (8.58). As explained after (C.315), for each $M \in \mathbb N$ we have an injective (and hence isometric) homomorphism $\varphi_M : B^M \to B^\infty$ that maps $a_{1/M} \in B^M$ to the equivalence class $[a_{1/N}]_N$ of the sequence $(a_{1/N})$ defined by
$$a_{1/N} = 0 \quad (N < M); \tag{8.59}$$
$$a_{1/N} = a_{1/M} \quad (N = M); \tag{8.60}$$
$$a_{1/(M+K)} = a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B \quad (K > 0), \tag{8.61}$$


with K copies of $1_B$. It is easy to verify that one might as well have started from quasi-local sequences and their equivalence classes, for which the limit (8.58) exists by an argument similar to (8.56). In that case the ensuing C*-algebra is already complete, which leads to a direct description of the elements of $B^\infty$ as equivalence classes of quasi-local sequences. This fact also follows from the following analogue of Theorem 8.4, which may be proved in the same way, i.e., from Proposition C.124, where this time the elements of $\tilde A$ are local sequences rather than symmetric ones (in fact, the proof is much easier, since this time we obtain (C.552) for free):

Theorem 8.8. For any unital C*-algebra B, the C*-algebras (8.32) and (8.34), i.e.,
$$A^{(q)}_0 = B^\infty; \tag{8.62}$$
$$A^{(q)}_{1/N} = B^N, \tag{8.63}$$
are the fibers of a continuous bundle $A^{(q)}$ of C*-algebras over $I = 1/\dot{\mathbb N}$ whose continuous cross-sections are the quasi-local sequences $(a_{1/N})$ with limit $a_0 = [a_{1/N}]_N$.
8.3 Quantum de Finetti Theorem


As an initial step in exploring the connection between the bundles $A^{(c)}$ and $A^{(q)}$ we prove Theorem 8.6, which we first restate in an equivalent form. Let $\mathfrak S_\infty$ be the group of bijections of $\mathbb N$ that differ from the identity only on a finite set. Each such finite permutation $p \in \mathfrak S_\infty$ defines a map $\alpha_p : B^\infty \to B^\infty$, as follows. Let $S \subset \mathbb N$ be the finite subset of $\mathbb N$ on which p acts nontrivially (if $S = \emptyset$ we have $p = \mathrm{id}_{\mathbb N}$, in which case also $\alpha_p = \mathrm{id}_{B^\infty}$, see below). Take a local sequence $(a_{1/N})_N$, so that (8.57) holds, in which we may assume $M \geq \max S$; we also redefine $a_{1/N} = 0$ for each N < M. For each $N \geq M \geq \max S$, the map p may be regarded as an element $p_N$ of $\mathfrak S_N$ by restriction to $\{1, \ldots, N\} \subset \mathbb N$ and hence p acts on $B^N$ by permuting the entries in elementary tensor products of operators, cf. (8.35). For each $p \in \mathfrak S_\infty$, define a map
$$\alpha_p : B^\infty \to B^\infty; \tag{8.64}$$
$$\alpha_p([a_{1/N}]_N) = [\alpha^{(N)}_{p_N}(a_{1/N})]_N. \tag{8.65}$$
This uses a specific representative of the equivalence class $[a_{1/N}]_N \in B^\infty$, but nonetheless the map $\alpha_p$ is well defined. Furthermore, since each $\alpha^{(N)}_p : B^N \to B^N$ is an automorphism (i.e., an invertible homomorphism), it is an isometry, so that also $\alpha_p$ is an isometry on its domain and hence extends to an automorphism of $B^\infty$. The ensuing map $p \mapsto \alpha_p$ from $\mathfrak S_\infty$ to the group $\mathrm{Aut}(B^\infty)$ of all automorphisms of $B^\infty$ is a homomorphism of groups, and we say that $\mathfrak S_\infty$ is an automorphism group of $B^\infty$.

Writing $S^{\mathfrak S_\infty}(B^\infty)$ for the set of all $\mathfrak S_\infty$-invariant states on $B^\infty$, i.e., $\rho \in S^{\mathfrak S_\infty}(B^\infty)$ iff $\rho \circ \alpha_p = \rho$ for each $p \in \mathfrak S_\infty$, we may now rephrase Theorem 8.6 as follows:


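Theorem 8.9. Let B be a unital C*-algebra. There is a bijection
$$\partial_e S^{\mathfrak S_\infty}(B^\infty) \cong S(B), \tag{8.66}$$
given by $\omega^\infty \leftrightarrow \omega$, where $\omega \in S(B)$, and $\omega^\infty \in S(B^\infty)$ is defined by, cf. (8.47),
$$\omega^\infty([a_{1/N}]_N) = \lim_{N\to\infty} \omega^N(a_{1/N}). \tag{8.67}$$

This is essentially the same as Theorem 8.6: for any $M \in \mathbb N$, a state on $B^M$ is infinitely exchangeable iff it is the restriction of an element of $S^{\mathfrak S_\infty}(B^\infty)$ to $B^M \subset B^\infty$, where the inclusion is given by the map $\varphi_M$ defined below (8.58).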


Proof. Let $S(B) \subset S^{\mathfrak S_\infty}(B^\infty)$ under the map $\omega \mapsto \omega^\infty$. We first show the inclusion
$$\partial_e S^{\mathfrak S_\infty}(B^\infty) \subseteq S(B) \tag{8.68}$$
contrapositively, i.e., if $\rho \in S^{\mathfrak S_\infty}(B^\infty)$ does not lie in S(B), then $\rho$ has a nontrivial convex decomposition in $S^{\mathfrak S_\infty}(B^\infty)$. We identify $B^M$ with $\varphi_M(B^M) \subset B^\infty$ and denote the restriction of $\rho$ to $B^M$ by $\rho_M$. If $\rho = \omega^\infty$ for some $\omega \in S(B)$, then
$$\rho_{M+K}(a_{1/M} \otimes b_{1/K}) = \rho_M(a_{1/M})\,\rho_K(b_{1/K}), \tag{8.69}$$
for each $a_{1/M} \in B^M$ and $b_{1/K} \in B^K$. If (8.69) holds whenever $0 \leq a_{1/M} \leq 1_{B^M}$, then by Lemma C.53 and (C.8) it always holds. Adding suitable multiples of the unit and
rescaling, it follows that if (8.69) holds whenever 


then it always holds. Therefore, if (8.69) fails, then it fails for some a‘ /M satisfying 


3: 
implies the existence of a nontrivial convex decomposition 


(8.70) and some and a /K> in which case 4 i< pPau(a ay) < 2, However, such a failure 


$$\rho = t\rho' + (1-t)\rho'', \tag{8.71}$$
with $t = \rho_M(a'_{1/M})$, and the functionals $\rho'$ and $\rho''$ on $B^\infty$ are defined by
$$\rho'([a_{1/N}]_N) = \lim_{N\to\infty} \rho_{M+N}(a'_{1/M} \otimes a_{1/N}) / \rho_M(a'_{1/M}); \tag{8.72}$$
$$\rho''([a_{1/N}]_N) = \lim_{N\to\infty} \rho_{M+N}((1_{B^M} - a'_{1/M}) \otimes a_{1/N}) / \rho_M(1_{B^M} - a'_{1/M}). \tag{8.73}$$
These limits exist on symmetric sequences (where they stabilize), and hence they exist in general. Furthermore, since $\rho_M(1_{B^M} - a'_{1/M}) = 1 - t$, the property (8.71) is obvious. Both $\rho'$ and $\rho''$ belong to $S^{\mathfrak S_\infty}(B^\infty)$, since each functional $\rho_{M+N}$ is an element of $S^{\mathfrak S_{M+N}}(B^{M+N})$. Finally, (8.71) is nontrivial, since if $\rho' = \rho''$, then $\rho'_K = \rho_K$, and hence (8.69) would hold (whose violation we assumed). This proves (8.68).
Though it is always true, for simplicity we prove the converse inclusion
$$S(B) \subseteq \partial_e S^{\mathfrak S_\infty}(B^\infty) \tag{8.74}$$
just for the case where B is generated by projections, as in the case $B = M_n(\mathbb C)$, B = B(H), or B a von Neumann algebra, or more generally an AW*-algebra (see §C.24). In that case also each $B^N$ is generated by its projections.

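For each $\rho \in S^{\mathfrak S_\infty}(B^\infty)$, each $N \in \mathbb N$, and each projection $e \in B^N$, we have
$$\rho_N(e)^2 \leq \rho_{2N}(e \otimes e), \tag{8.75}$$
see below. Assuming (8.75), suppose $\omega \in S(B)$ and $\omega^\infty = t\rho' + (1-t)\rho''$ for some $t \in (0,1)$ and $\rho', \rho'' \in S^{\mathfrak S_\infty}(B^\infty)$. Since $\omega^\infty_N = \omega^N$, we then have
$$\begin{aligned}
\omega^N(e)^2 &= (t\rho'_N(e) + (1-t)\rho''_N(e))^2 = \left\langle \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix}, \begin{pmatrix} \sqrt{t}\,\rho'_N(e) \\ \sqrt{1-t}\,\rho''_N(e) \end{pmatrix} \right\rangle^2 \\
&\leq \left\langle \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix}, \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix} \right\rangle \left\langle \begin{pmatrix} \sqrt{t}\,\rho'_N(e) \\ \sqrt{1-t}\,\rho''_N(e) \end{pmatrix}, \begin{pmatrix} \sqrt{t}\,\rho'_N(e) \\ \sqrt{1-t}\,\rho''_N(e) \end{pmatrix} \right\rangle \\
&= t\rho'_N(e)^2 + (1-t)\rho''_N(e)^2 \\
&\leq t\rho'_{2N}(e \otimes e) + (1-t)\rho''_{2N}(e \otimes e) = \omega^{2N}(e \otimes e) = \omega^N(e)^2,
\end{aligned}$$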
where the inner product in the first line is the usual one in $\mathbb R^2$, and, noting it is positive, we have used the Cauchy–Schwarz inequality for this inner product, as well as (8.75). Hence both inequalities must be equalities, and for the first one this implies $\rho'_N(e) = \rho''_N(e)$. Since this is true for all N and all projections in $B^N$, this implies $\rho' = \rho'' = \omega^\infty$, so that $\omega^\infty \in \partial_e S^{\mathfrak S_\infty}(B^\infty)$, and (8.74) has been established, up to the proof of (8.75). To this effect, note for each $M \in \mathbb N$ and $t \in \mathbb R$ we have
$$\rho_{MN}\big((1_{B^N} \otimes \cdots \otimes 1_{B^N} \otimes e + \cdots + e \otimes 1_{B^N} \otimes \cdots \otimes 1_{B^N} + t \cdot 1_{B^{MN}})^2\big) \tag{8.76}$$
$$= M(M-1)\rho_{2N}(e \otimes e) + M\rho_N(e) + 2tM\rho_N(e) + t^2, \tag{8.77}$$
with M − 1 copies of $1_{B^N}$ and e moving from right to left in the first line, leaving M terms before the final one $t \cdot 1_{B^{MN}}$ in (8.76). In working out the square in (8.76) and moving to the second line we used $e^2 = e$ as well as permutation invariance of the state $\rho_{MN}$. The point is that (8.76) is positive, so that (8.77) must be positive, too, for all $M \in \mathbb N$ and $t \in \mathbb R$. Now a function $f(t) = t^2 + 2bt + c = (t+b)^2 - b^2 + c$ obviously satisfies $f(t) \geq 0$ for each t iff $b^2 \leq c$, so that (8.76) is positive for all t iff
$$M^2 \rho_N(e)^2 \leq M(M-1)\rho_{2N}(e \otimes e) + M\rho_N(e).$$
Dividing by $M^2$ and letting $M \to \infty$ gives (8.75).
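The inequality (8.75) and the finite-M inequality preceding it are easy to test in one family of examples. The sketch below assumes that the permutation-invariant state is a finite convex mixture of product states, $\rho = \sum_i t_i\, \omega_i^\infty$ (which is the form the theorem leads to, but is put in by hand here), so that $\rho_N(e) = \sum_i t_i p_i$ and $\rho_{2N}(e \otimes e) = \sum_i t_i p_i^2$ with $p_i = \omega_i^N(e)$. The function names are hypothetical.

```python
def rho_N(weights, ps):
    """rho_N(e) for a mixture of product states with omega_i^N(e) = p_i."""
    return sum(t * p for t, p in zip(weights, ps))

def rho_2N(weights, ps):
    """rho_{2N}(e (x) e): each product state factorizes, giving p_i squared."""
    return sum(t * p * p for t, p in zip(weights, ps))

def finite_M_inequality(weights, ps, M):
    """M^2 rho_N(e)^2 <= M(M-1) rho_{2N}(e (x) e) + M rho_N(e)."""
    r1, r2 = rho_N(weights, ps), rho_2N(weights, ps)
    return M * M * r1 * r1 <= M * (M - 1) * r2 + M * r1 + 1e-12

def de_finetti_gap(weights, ps):
    """rho_{2N}(e (x) e) - rho_N(e)^2, which is nonnegative by (8.75)."""
    return rho_2N(weights, ps) - rho_N(weights, ps) ** 2
```

In this family the gap is the variance of the $p_i$ under the weights $t_i$, so it vanishes exactly when a single product state occurs, which mirrors the extremality argument in the proof.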


Taking B = C(X) for some compact Hausdorff space X, in view of (8.41) the situation may be transferred to the Cartesian product $X^N$, equipped with the product topology (which is generated by products $A_1 \times \cdots \times A_N \subset X^N$ with each $A_i \subset X$ open) and the ensuing Borel σ-algebra (generated by the above products with each $A_i$ Borel). If $\mu_1, \ldots, \mu_N$ are (probability) measures on X (in which case we write $\mu_i \in \mathrm{Pr}(X)$), then there is a unique (probability) measure $\mu_1 \times \cdots \times \mu_N$ whose value on a product as above is equal to $\mu_1(A_1) \cdots \mu_N(A_N)$. In particular, any probability measure $\mu \in \mathrm{Pr}(X)$ on X defines a probability measure $\mu^N$ on $X^N$.

The symmetric group $\mathfrak S_N$ acts on $X^N$ in the obvious way, and hence it acts on the power set $\mathcal P(X^N)$. We call the latter action $\alpha^{(N)}$, so that for $p \in \mathfrak S_N$ we have
$$\alpha^{(N)}_p(A_1 \times \cdots \times A_N) = A_{p(1)} \times \cdots \times A_{p(N)}. \tag{8.78}$$
The Cartesian product $X^\infty = X^{\mathbb N}$ is well defined both topologically and measure-theoretically (the topology is generated by all products $\prod_i A_i$ with finitely many $A_i$ open and different from X, and likewise for the Borel structure), and the infinite symmetric group $\mathfrak S_\infty = \cup_N \mathfrak S_N$ acts on it in the obvious way, in that $p \in \mathfrak S_N \subset \mathfrak S_\infty$ permutes the first N coordinates. Specializing Definition 8.5 to B = C(X), we obtain:


Definition 8.10. A probability measure $\nu_N$ on $X^N$ is called:

• permutation-invariant if $\nu_N \circ \alpha^{(N)}_p = \nu_N$ for any $p \in \mathfrak S_N$.
• K-exchangeable ($K \in \mathbb N$) if it is permutation-invariant and in addition $\nu_N$ is the restriction to $X^N$ of some permutation-invariant probability measure on $X^{N+K}$.
• exchangeable if it is K-exchangeable for all $K \in \mathbb N$.

A probability measure $\nu_\infty$ on $X^\infty$ is called permutation-invariant if $\nu_\infty \circ \alpha^{(N)}_p = \nu_\infty$ for any $p \in \mathfrak S_N$ and $N \in \mathbb N$, where $\alpha^{(N)}_p$ acts on $\prod_i A_i$ by (8.78) on the first N factors $A_1, \ldots, A_N$ whilst acting trivially on all remaining $A_i$'s.


The connection between the two parts of this definition is that $\nu_N$ is exchangeable iff it is the restriction to $X^N$ of some permutation-invariant measure $\nu_\infty$ on $X^\infty$.
From Theorems 8.6 and 8.3 we obtain the Hewitt-Savage Theorem:

Corollary 8.11. Let X be a compact Hausdorff space. For any $N \in \mathbb N$, any infinitely exchangeable probability measure $\nu_N$ on $X^N$ takes the form
$$\nu_N = \int_{\mathrm{Pr}(X)} dP(\mu)\, \mu^N \tag{8.79}$$
for some probability measure P on Pr(X) that is uniquely determined by $\nu_N$, and similarly for $N = \infty$, where $\nu_\infty$ is a permutation-invariant probability measure.

The two claims in the corollary are equivalent by the remark after Definition 8.10.


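The probability measure $P \in \mathrm{Pr}(\mathrm{Pr}(X))$ has the following interpretation. For $N \in \mathbb N$ and $(x_1, \ldots, x_N) \in X^N$, define the so-called empirical measure $E_N^{(x_1,\ldots,x_N)}$ on X as
$$E_N^{(x_1,\ldots,x_N)} = \frac{1}{N} \sum_{k=1}^{N} \delta_{x_k}, \tag{8.80}$$
i.e.,
$$\int_X dE_N^{(x_1,\ldots,x_N)}\, f = \frac{1}{N} \sum_{k=1}^{N} f(x_k). \tag{8.81}$$
Given a probability measure $\nu_N$ on $X^N$, these formulae give a random probability measure on X depending on a drawing from $X^N$, i.e., a map
$$E_N : X^N \to \mathrm{Pr}(X); \tag{8.82}$$
$$(x_1, \ldots, x_N) \mapsto E_N^{(x_1,\ldots,x_N)}. \tag{8.83}$$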
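For a finite sample the empirical measure (8.80) is just a table of relative frequencies, and (8.81) is the corresponding sample mean. A minimal sketch, assuming X is a finite set whose points are hashable Python values (the function names are hypothetical); exact fractions are used so that no rounding enters.

```python
from collections import Counter
from fractions import Fraction

def empirical_measure(sample):
    """E_N^{(x_1,...,x_N)} as a dict: point of X -> weight (1/N times multiplicity)."""
    N = len(sample)
    return {x: Fraction(count, N) for x, count in Counter(sample).items()}

def integrate(measure, f):
    """Integral of f against a finitely supported measure, cf. (8.81)."""
    return sum(weight * f(x) for x, weight in measure.items())

# Four coin tosses: heads, tails, heads, heads.
measure = empirical_measure(["H", "T", "H", "H"])
heads_frequency = integrate(measure, lambda x: 1 if x == "H" else 0)
```

The map `empirical_measure` plays the role of $E_N$ in (8.82) - (8.83): it sends a drawing from $X^N$ to a probability measure on X.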


Proposition 8.12. The probability measure P in Corollary 8.11 is given by
$$\lim_{N\to\infty} \int_{\mathrm{Pr}(X)} dP_N\, F = \int_{\mathrm{Pr}(X)} dP\, F, \tag{8.84}$$
for each $F \in C(\mathrm{Pr}(X))$ (that is, $P = \lim_{N\to\infty} P_N$ weakly), where $P_N \in \mathrm{Pr}(\mathrm{Pr}(X))$ is the probability measure on Pr(X) defined by $\nu_N \in \mathrm{Pr}(X^N)$ and (8.82) - (8.83), i.e.,
$$P_N(A) = \nu_N(E_N^{-1}(A)) \quad (A \subset \mathrm{Pr}(X)). \tag{8.85}$$


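Proof. By the Stone–Weierstrass Theorem it suffices to prove (8.84) for linear combinations of monomials like $F(\mu) = \mu(f_1) \cdots \mu(f_K)$, where $f_1, \ldots, f_K \in C(X)$ are arbitrary and $\mu(f) = \int_X d\mu\, f$. This is a simple computation: using (8.85), we have
$$\int_{\mathrm{Pr}(X)} dP_N\, F = \int_{X^N} d\nu_N(x_1, \ldots, x_N)\, F(E_N^{(x_1,\ldots,x_N)}) = \int_{X^N} d\nu_N(x_1, \ldots, x_N) \prod_{j=1}^{K} \Big( \frac{1}{N} \sum_{i=1}^{N} f_j(x_i) \Big),$$
and therefore
$$\begin{aligned}
\lim_{N\to\infty} \int_{\mathrm{Pr}(X)} dP_N\, F &= \lim_{N\to\infty} \int_{\mathrm{Pr}(X)} dP(\mu) \int_{X^N} d\mu^N(x_1, \ldots, x_N) \prod_{j=1}^{K} \Big( \frac{1}{N} \sum_{i=1}^{N} f_j(x_i) \Big) \\
&= \int_{\mathrm{Pr}(X)} dP(\mu) \int_X d\mu(x_1)\, f_1(x_1) \cdots \int_X d\mu(x_K)\, f_K(x_K) = \int_{\mathrm{Pr}(X)} dP\, F.
\end{aligned}$$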
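The convergence in (8.84) can be followed exactly in a toy case. The sketch below assumes X = {0, 1}, a mixing measure P supported on finitely many Bernoulli measures (given as pairs of weight and success probability p), and the monomial $F(\mu) = \mu(f)^2$ with f(x) = x. Under $\mu^N$ the number of ones is binomial, so $\int dP_N F$ is a finite sum, and it differs from $\int dP\, F$ by the variance term $\sum_i w_i p_i (1 - p_i)/N$. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def pushforward_integral(mixture, N, K=2):
    """Integral of F(mu) = mu(f)^K against P_N, for f(x) = x on X = {0, 1}.
    mixture: list of (weight, p) pairs describing P; all numbers are Fractions."""
    total = Fraction(0)
    for weight, p in mixture:
        for k in range(N + 1):
            prob = comb(N, k) * p ** k * (1 - p) ** (N - k)  # binomial weight
            total += weight * prob * Fraction(k, N) ** K      # F at the empirical measure
    return total

def limit_integral(mixture, K=2):
    """Integral of F against P itself: sum of weight * p^K."""
    return sum(weight * p ** K for weight, p in mixture)

mixture = [(Fraction(1, 2), Fraction(1, 4)), (Fraction(1, 2), Fraction(3, 4))]
gap = pushforward_integral(mixture, 10) - limit_integral(mixture)
```

The gap is of order 1/N, which makes the weak convergence $P_N \to P$ visible for this particular F.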


We can also say more about the limit of the sum (8.81). So far, we have been dealing with the Borel σ-algebras $\mathcal B_N \subset \mathcal P(X^N)$ and $\mathcal B_\infty \subset \mathcal P(X^\infty)$ generated by the topology (i.e., by the open sets). On top of this, consider $\mathcal S_N \subset \mathcal B_N$, defined as the σ-algebra generated by the permutation-invariant Borel subsets of $X^N$, or, equivalently, as the smallest σ-algebra for which the permutation-invariant Borel measurable functions on $X^N$ are measurable. Likewise, $\mathcal S_\infty \subset \mathcal B_\infty$; regarding $A \subset X^N$ as a subset $A \times \prod_{k>N} X$ of $X^\infty$, we have $\mathcal S_\infty = \cap_{N\in\mathbb N} \mathcal S_N$. For any permutation-invariant probability measure $\nu_N$ on $X^N$ the Hilbert space $L^2(X^N, \mathcal S_N, \nu_N)$ is a closed subspace of $L^2(X^N, \mathcal B_N, \nu_N)$, and the associated conditional expectation
$$E_{(\mathcal S_N, \nu_N)} : L^2(X^N, \mathcal B_N, \nu_N) \to L^2(X^N, \mathcal S_N, \nu_N) \tag{8.86}$$
is defined as the corresponding orthogonal projection. Since $C(X^N) \subset L^2(X^N)$, this map restricts to $C(X^N)$. Similarly for $N = \infty$. For each $N \in \mathbb N$, and also for $N = \infty$, we may regard $f \in C(X)$ as a function $f_K$ on $X^N$ through
$$f_K(x_1, \ldots, x_N) = f(x_K), \quad K = 1, \ldots, N. \tag{8.87}$$


Proposition 8.13. Let $\nu_\infty$ be a permutation-invariant probability measure on $X^\infty$, with restriction $\nu_N$ to $X^N$. Recall (8.42). For any $f \in C(X)$ we have pointwise:
$$S_{1,N}(f) = E_{(\mathcal S_N, \nu_N)}(f_1), \quad \nu_N\text{-almost surely}; \tag{8.88}$$
$$\lim_{N\to\infty} S_{1,N}(f) = E_{(\mathcal S_\infty, \nu_\infty)}(f_1), \quad \nu_\infty\text{-almost surely}, \tag{8.89}$$
where the left-hand sides of (8.88) and (8.89) are functions on $X^N$ and $X^\infty$, respectively. Furthermore, if $\nu_\infty = \mu^\infty$ for some $\mu \in \mathrm{Pr}(X)$, then pointwise on $X^\infty$,
$$\lim_{N\to\infty} S_{1,N}(f) = \int_X d\mu\, f, \quad \mu^\infty\text{-almost surely} \quad (f \in C(X)). \tag{8.90}$$
Equivalently, if $L_\mu \subset X^\infty$ is the set of infinite sequences $(x_1, x_2, \ldots)$ in $X^\infty$ for which the limit in (8.90) exists for each $f \in C(X)$ and equals $\int_X d\mu\, f$, then
$$\mu^\infty(L_\mu) = 1. \tag{8.91}$$


Proof. Eq. (8.88) is almost trivial, since $S_{1,N}(f)$ is permutation invariant and hence already lies in $L^2(X^N, \mathcal S_N, \nu_N)$, so that the equality just expresses the projection property $E^2_{(\mathcal S_N, \nu_N)} = E_{(\mathcal S_N, \nu_N)}$. Eq. (8.89) follows from the ergodic theorem, applied to the probability space $(X^\infty, \mathcal B_\infty, \nu_\infty)$, the unilateral shift
$$T : (x_1, x_2, \ldots) \mapsto (x_2, x_3, \ldots),$$
and the random variable $f_1$ defined by $f \in C(X)$ via (8.87). Since $\nu_\infty$ is permutation invariant, it is also T-invariant (in the sense that $\nu_\infty(T^{-1}(A)) = \nu_\infty(A)$ for any $A \in \mathcal B_\infty$). This follows either directly, where one has to realize firstly that
$$T^{-1}(A_1 \times A_2 \times \cdots \times A_N \times X \times \cdots) = X \times A_1 \times A_2 \times \cdots \times A_N \times X \times \cdots,$$
and secondly that $\mathcal B_\infty$ is generated by products $\prod_i A_i$ with finitely many $A_i$ different from X, or, more easily, from Corollary 8.11. The (pointwise) ergodic theorem gives
$$\lim_{N\to\infty} S_{1,N}(f) = E_{(\mathcal B_T, \nu_\infty)}(f_1), \quad \nu_\infty\text{-almost surely} \quad (f \in C(X)), \tag{8.92}$$
where $\mathcal B_T$ is the σ-algebra within $\mathcal B_\infty$ generated by the T-invariant sets, and $f_1 \in C(X^\infty)$ is still
defined by (8.87). Since .% C Bs and the left-hand side of (8.89) is .~.-measurable 
(provided it exists, as we have just shown), eq. (8.89) follows from (8.92). 

If $\nu_\infty = \mu^\infty$, then the unilateral shift on $X^\infty$ is ergodic by Kolmogorov's 0-1 law, and hence the ergodic theorem gives (8.90). Alternatively, if $\nu_\infty = \mu^\infty$, then the random variables $(f_N)$, defined by (8.87) with $N = \infty$, are i.i.d. (i.e., independent and identically distributed) and (8.90) follows from the strong law of large numbers (which, coherently, in turn may be derived from the ergodic theorem!).


Note that (8.92) has been proved for $f \in C(X)$, but it holds for many other functions, including $f = 1_A$, where $A \subset X$ is a Borel set. This gives Borel's law of large numbers
$$\lim_{N\to\infty} S_{1,N}(1_A) = \mu(A), \quad \mu^\infty\text{-almost surely}. \tag{8.93}$$
For example, take X = {0,1} (e.g., a coin toss with outcomes 1 = heads and 0 = tails). With f(x) = x in (8.90) or A = {1} in (8.93), writing $p = \mu(\{1\})$, we obtain
$$\lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} x_k = p, \quad \mu^\infty\text{-almost surely on } 2^{\mathbb N}. \tag{8.94}$$
Equivalently, if $L_p \subset 2^{\mathbb N}$ is the set of infinite binary sequences $x_1 x_2 \cdots$ for which the limit in (8.94) exists and equals p, then $\mu^\infty(L_p) = 1$, cf. (8.91).
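A finite simulation can only hint at (8.94), since the statement concerns infinite sequences, but it shows the relative frequency settling near p. The following is a minimal sketch using Python's standard pseudo-random generator with a fixed seed; the function name and the choices p = 0.3 and N = 200000 are illustrative assumptions.

```python
import random

def relative_frequency(p: float, N: int, seed: int = 0) -> float:
    """Relative frequency of heads (outcome 1) in N simulated independent tosses
    of a coin with heads-probability p."""
    rng = random.Random(seed)  # fixed seed, so the run is reproducible
    heads = 0
    for _ in range(N):
        if rng.random() < p:
            heads += 1
    return heads / N

frequency = relative_frequency(0.3, 200_000)
```

With these parameters the standard deviation of the relative frequency is about 0.001, so deviations from p beyond the second decimal place are extremely unlikely, yet never excluded for any finite N.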
8.4 Frequency interpretation of probability and Born rule


Results like (8.90), (8.93), and (8.94) give a relationship between the single-case probabilities $\mu(A)$ or p and the limits of long series of trials on samples drawn according to $\mu$ or p. Despite the seemingly comforting appearance of $N < \infty$ on the left-hand side, this relationship depends in an essential way on the infinite idealization $X^\infty$, which is strictly necessary in order to be able to say that the limit (8.94) holds almost surely relative to the measure $\mu^\infty$. This violates Earman's Principle (cf. the Introduction), which is the reason why we prefer the limit (8.49) over (8.93).
Although these results are mathematically equivalent, both formalizing the idea that if $(x_1, \ldots, x_N)$ are sampled from X according to some probability measure $\mu$, then $(1/N)\sum_{n=1}^{N} f(x_n)$ converges to $\int_X d\mu\, f$ as $N \to \infty$, in (8.49) we never need to work with the “actual infinity” $N = \infty$ and (8.49) holds everywhere on Pr(X) rather than almost everywhere on $X^\infty$. One reason for this is that in (8.93) etc. the choice of the sampling measure $\mu$ has to be made at the beginning, whereas in (8.49) it only comes in at the very end. But it has to be made either way, and similarly for any other serious effort to relate probability to frequencies in long runs of measurements.
The extreme delicacy of such efforts is clear from the fact that limiting results like (8.90), (8.93), and (8.94) are insensitive to any finite part of the sum, whereas any practical use of probability only involves finite trials. As Lord Keynes once said:


‘In the long run we are all dead.’ 


The founder of the mathematical theory of probability expressed himself likewise: 


‘The frequency concept based on the notion of limiting frequency as the number of trials 
increased to infinity, does not contribute anything to substantiate the applicability of the 
results of probability theory to real practical problems where we have always to deal with a 
finite number of trials.’ (Kolmogorov). 


Moreover, a definition of probability based on e.g. (8.93) is well known to be circular: although superficially the “almost sure” terminology in the statement of the result might instill confidence in the reader, in fact it is an exceptionally strong constraint on the sequences $(x_n) \in X^\infty$ in question that the limit should exist and have the right value $\mu(A)$, i.e., that $(x_n) \in L_\mu$, cf. (8.91), and we see that this constraint can only be formulated if the single-case probability $\mu$ was already defined in the first place. This shows that the link between probability and frequencies of outcomes of long runs of trials only exists and makes sense if single-case probabilities are prior.

On the other hand, if single-case probabilities are “objective”, as those provided 
by the Born measure in quantum mechanics ought to be at least in remotely realistic 
interpretations of the theory (as opposed to “personal” or “subjective” probabili- 
ties construed as “degrees of belief” or “rationality constraints” or whatever other 
decision-theoretic concept in human psychology), then it is hard to say what they 
really mean, since it is precisely about single cases that they do not seem to say 
anything. This brings us to what we propose to call the Paradox of Probability: 
Although single-case probabilities must be logically prior to probabilities construed 
as frequencies, the numerical values of the former have no bearing on single trials 
and can only be validated through their predictions about (finite) frequencies. 
This paradox imposes the following consistency requirement (which philosophers may want to compare with Lewis’s “Principal Principle” that regulates credences):


The assumption that a single-case probability measure be $\mu$ must imply that the probabilities for the various outcomes of long runs of repetitions of identical experiments (provided these are possible) are distributed according to $\mu$.


This describes the relationship between theoretical and experimental physics quite 
well, but still leaves us in the dark as to the meaning of single-case probabilities! 

We are now ready to revisit the Born rule, which we already discussed from a purely mathematical point of view in §§2.1, 2.5, and 4.1. To repeat the main point, if $a = a^* \in B(H)$ is a bounded self-adjoint operator on a Hilbert space, with spectrum $\sigma(a)$, then any state $\omega$ on $B(H)$ defines a unique probability measure $\mu_\omega$ on $\sigma(a) \subset \mathbb{R}$, called the Born measure, such that

$$\omega(f(a)) = \int_{\sigma(a)} d\mu_\omega\, f, \quad f \in C(\sigma(a)), \tag{8.95}$$

where $f(a) \in C^*(a) \subset B(H)$ is defined through the continuous functional calculus (Theorem 4.3). For example, for $f = \mathrm{id}_{\sigma(a)}$, i.e., the function $x \mapsto x$, eq. (8.95) yields

$$\omega(a) = \int_{\sigma(a)} d\mu_\omega(\lambda)\, \lambda. \tag{8.96}$$
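In finite dimension the content of (8.95) and (8.96) can be checked by hand. The sketch below is an illustration under simplifying assumptions that are not in the text: the observable is taken diagonal, $a = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$, the state is a vector state, and the helper names `born_measure` and `expectation` are hypothetical. The Born measure then assigns to each eigenvalue the total squared amplitude of the corresponding components, and both sides of (8.95) can be compared.

```python
def born_measure(eigenvalues, v):
    """Born measure of a = diag(eigenvalues) in the vector state defined by v."""
    norm2 = sum(abs(c) ** 2 for c in v)
    mu = {}
    for lam, c in zip(eigenvalues, v):
        # Degenerate eigenvalues collect the weight of all their components.
        mu[lam] = mu.get(lam, 0.0) + abs(c) ** 2 / norm2
    return mu

def expectation(eigenvalues, v, f):
    """<v, f(a) v> / <v, v>, computed directly from the diagonal matrix f(a)."""
    norm2 = sum(abs(c) ** 2 for c in v)
    return sum(f(lam) * abs(c) ** 2 for lam, c in zip(eigenvalues, v)) / norm2

def f(x):
    return x ** 2 + 1

eigenvalues = [0.0, 1.0, 1.0]   # spectrum {0, 1}, eigenvalue 1 is degenerate
v = [1 + 1j, 1.0, -1j]          # deliberately not normalized

mu = born_measure(eigenvalues, v)
lhs = expectation(eigenvalues, v, f)              # left-hand side of (8.95)
rhs = sum(f(lam) * w for lam, w in mu.items())    # integral of f against mu
print(mu, lhs, rhs)
```

For a non-diagonal self-adjoint matrix one would first diagonalize it; that step is omitted here to keep the sketch free of external libraries.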


The point of this construction of the Born measure is that it is obtained by simply restricting the state $\omega$, initially defined on $B(H)$, to its commutative C*-subalgebra $C^*(a)$. If, in the spirit of (exact) Bohrification, such commutative algebras are identified with corners of classical physics within quantum theory, one may argue that Heisenberg gave the right picture of the origin of probability in quantum mechanics:


‘One may call these uncertainties objective, in that they are simply a consequence of the 
fact that we describe the experiment in terms of classical physics; they do not depend in 
detail on the observer. One may call them subjective, in that they reflect our incomplete 
knowledge of the world.’ (Heisenberg, 1958, pp. 53-54) 


See, however, §11.1. In any case, there are extensions of this construction to un- 
bounded self-adjoint operators as well as to families of commuting self-adjoint op- 
erators, to which the following discussion applies, too, mutatis mutandis. 

The Born rule relates the Born measure for a to measurements of a and as such 
is responsible for most predictions of quantum physics, especially in quantum field 
theory, where the connection between theory and experiment mainly involves the 
measurement of cross-sections computed from the Born measure via Feynman rules. 
The Born rule and the Heisenberg uncertainty relations are often seen as a turning 
point where indeterminism entered fundamental physics. Nonetheless, it is hard to 
say what this Born rule actually states! We made a first attempt in §4.1: 


If an observable $a$ is measured in a state $\omega$, then the probability $P_\omega(a \in A)$ that the outcome lies in some measurable subset $A \subset \sigma(a) \subset \mathbb{R}$ is given by

$$P_\omega(a \in A) = \mu_\omega(A). \tag{8.97}$$
Two questions immediately arise:


1. What is meant by a “measurement” of $a$ (and by its “outcome”)?
2. What does the “probability” $P_\omega(a \in A)$ mean?


Perhaps these are even the main questions in the foundations of quantum mechanics. The first will be taken up in Chapter 11; for now, we simply assume that measurements of quantum-mechanical observables $a$ are defined and have outcomes in $\sigma(a)$. The second has just been answered (or some might say evaded): through the Born measure, the formalism of quantum mechanics provides numerical values of $\mu_\omega(A)$, whose mathematical meaning seems unquestionable, and whose operational meaning is given by the predictions they give for outcomes of long runs of repetitions of identical experiments. Therefore, all that remains to be done is to derive these predictions by analogy with the results in §8.3 for the commutative C*-algebra $C(X)$.
One such attempt is, in its strengths and its weaknesses, quite analogous to Borel’s law of large numbers (8.93). Although we will soon move to $B = B(H)$, the following result is valid for any unital C*-algebra $B$, with infinite tensor product $B^\infty$ as defined in §C.14 and recalled at the end of §8.2, including the map $\varphi_M : B^M \to B^\infty$.


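Proposition 8.14. If $\omega \in S(B)$, there is a unique state $\omega^\infty$ on $B^\infty$ such that

$$\omega^\infty(\varphi_M(b_1 \otimes \cdots \otimes b_M)) = \prod_{m=1}^{M} \omega(b_m), \quad M \in \mathbb{N},\ b_1,\ldots,b_M \in B. \tag{8.98}$$

Moreover, $\omega^\infty$ is pure iff $\omega$ is pure.


This is a special case of Proposition C.105, with $C_i = B$ and $\omega_i = \omega$ for all $i \in \mathbb{N}$.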
We now take $B = B(H)$ for some separable Hilbert space $H$, some observable $a = a^* \in B(H)$ with spectrum $\sigma(a) \subset \mathbb{R}$, and some unit vector $v \in H$, with associated (normal) pure state $\omega_v$ on $B(H)$ defined by $\omega_v(b) = \langle v, bv\rangle$, and Born measure $\mu_v = \mu_{\omega_v}$ on $\sigma(a)$. Now take the corresponding pure state $\omega_v^\infty$ on $B(H)^\infty$ and construct the associated GNS-representation $\pi_{\omega_v^\infty}(B(H)^\infty)$. The Hilbert space $H_{\omega_v^\infty}$ carrying this representation is an example of an infinite tensor product of Hilbert spaces in the sense of von Neumann, which may also be defined directly, as follows. Take sequences $(\psi_n) = (\psi_1, \psi_2, \ldots)$ with $\psi_n \in H$ satisfying the condition

$$\sum_n \big|\, \|\psi_n\| - 1 \big| < \infty; \tag{8.99}$$

the rationale behind this condition is that for any sequence $(z_n)$ of complex numbers, the product $\prod_n z_n$ converges and has a nonzero limit iff $\sum_n |z_n - 1| < \infty$, so (8.99) is equivalent to the requirement that $\prod_n \|\psi_n\|$ converges to some nonzero value. Following von Neumann, we now introduce the convention that if, for some sequence $(z_n)$ of complex numbers, $\prod_n |z_n|$ converges but $\prod_n z_n$ does not, we define the latter to be zero. On this convention, linear and continuous extension of the expression

$$\langle (\psi_n), (\psi'_n)\rangle = \prod_n \langle \psi_n, \psi'_n\rangle_H, \tag{8.100}$$
defines an inner product on the finite linear span $H_0^\infty$ of all sequences $(\psi_n)$ satisfying (8.99); the complete tensor product $H^\infty$ is defined as the closure of $H_0^\infty$ in the ensuing norm. However, this is not the Hilbert space of interest, since it is far too large (e.g., it is not separable even if $H$ is). To define interesting separable subspaces of $H^\infty$, we call sequences $(\psi_n)$ and $(\psi'_n)$ that both satisfy (8.99) equivalent if

$$\sum_n |\langle \psi_n, \psi'_n\rangle - 1| < \infty; \tag{8.101}$$

this turns out to be a bona fide equivalence relation. In particular, if $(\psi_n)$ and $(\psi'_n)$ are inequivalent, then $\langle(\psi_n),(\psi'_n)\rangle = 0$. For any unit vector $v \in H$, we now define the incomplete tensor product $H_v^\infty$ as the closure of the linear span of all sequences $(\psi_n)$ that satisfy (8.99) and are equivalent to $v^\infty$ (i.e., the sequence $(\psi_n)$ with $\psi_n = v$ for each $n$), with inner product borrowed from $H^\infty$ (note that von Neumann’s terminology “incomplete” is somewhat confusing, since $H_v^\infty$ is complete as a normed vector space and in particular it is a Hilbert space). By construction, $v^\infty \in H_v^\infty$, and it is easy to show that $H_v^\infty$ is the closed linear span of all sequences $(\psi_n)$ that differ from $v \in H$ in at most finitely many places. We often write $\otimes_n \psi_n$ or $\psi_1 \otimes \psi_2 \otimes \cdots$ for $(\psi_n)$. Furthermore, for any $M \in \mathbb{N}$, any $b \in B(H)$ defines a bounded operator $b^{(M)}$ on $H_v^\infty$ by continuous linear extension of

$$b^{(M)}(\psi_1 \otimes \psi_2 \otimes \cdots \otimes \psi_M \otimes \cdots) = \psi_1 \otimes \psi_2 \otimes \cdots \otimes b\psi_M \otimes \cdots. \tag{8.102}$$


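This extends to a representation $\pi_v^\infty$ of $B^\infty$ on $H_v^\infty$, as follows. Define $b^{(M)} \in B^\infty$ by

$$b^{(M)} = \varphi_M(1_H \otimes \cdots \otimes 1_H \otimes b), \tag{8.103}$$

in which $1_H \otimes \cdots \otimes 1_H \otimes b \in B^M$, and $\varphi_M : B^M \to B^\infty$ was defined after (8.58). In other words, for $b \in B(H)$, the operator $b^{(M)}$ is the element of $B^\infty$ given by the equivalence class of the sequence that has $1_H$ in every place except the $M$-th, where it equals $b$. We then define $\pi_v^\infty(B^\infty)$ by linear and continuous extension of

$$\pi_v^\infty\big(b_1^{(M_1)} \cdots b_n^{(M_n)}\big) = b_1^{(M_1)} \cdots b_n^{(M_n)}. \tag{8.104}$$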


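Proposition 8.15. For any unit vector $v \in H$, the GNS-representation $\pi_{\omega_v^\infty}(B^\infty)$ on $H_{\omega_v^\infty}$ is unitarily equivalent with $\pi_v^\infty(B^\infty)$ on $H_v^\infty$, under which equivalence the cyclic vector $\Omega_{\omega_v^\infty} \in H_{\omega_v^\infty}$ corresponds with $v^\infty \in H_v^\infty$.


Proof. This is a simple consequence of Proposition C.91 and the equality

$$\omega_v^\infty(a) = \langle v^\infty, \pi_v^\infty(a) v^\infty\rangle_{H_v^\infty}, \tag{8.105}$$

initially for $a = b^{(M)}$, subsequently for $a = b_1^{(M_1)} \cdots b_n^{(M_n)}$, and finally, by linearity and continuity, for any $a \in B^\infty$.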
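As a numerical aside on the convergence criteria (8.99) and (8.101) used in the construction above, the following sketch compares partial products for a summable and a non-summable sequence of deviations from 1. The sequences $1 + 1/n^2$ and $1 + 1/n$, the cutoff `N`, and the overlap value 0.9 are illustrative choices made here, not taken from the text.

```python
import math

def partial_product(terms):
    """Product of a finite sequence of real numbers."""
    prod = 1.0
    for z in terms:
        prod *= z
    return prod

N = 10_000

# sum |z_n - 1| = sum 1/n^2 < infinity: the product converges to a nonzero
# limit (in fact to sinh(pi)/pi, by a classical product formula).
summable = partial_product(1 + 1 / n**2 for n in range(1, N + 1))

# sum |z_n - 1| = sum 1/n diverges: the partial products telescope to N + 1
# and hence grow without bound.
not_summable = partial_product(1 + 1 / n for n in range(1, N + 1))

# Constant sequences (v, v, ...) and (w, w, ...) with overlap <v, w> = 0.9
# violate (8.101); the partial products of (8.100) tend to zero.
overlap = 0.9  # hypothetical value of <v, w> for two distinct unit vectors
inner = partial_product(overlap for _ in range(200))

print(summable, math.sinh(math.pi) / math.pi, not_summable, inner)
```

The last number illustrates why inequivalent sequences end up orthogonal in the complete tensor product.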


In view of this, we will henceforth identify the two Hilbert spaces etc., so that:

$$H_{\omega_v^\infty} = H_v^\infty; \tag{8.106}$$
$$\pi_{\omega_v^\infty}(b^{(M)}) = b^{(M)}; \tag{8.107}$$
$$\Omega_{\omega_v^\infty} = v^\infty. \tag{8.108}$$


Recall that $\mathcal{P}(H)$ is the set of all projections on $H$, seen as a lattice ordered by $e \leq f$ iff $ef = e$, which is equivalent to $eH \subseteq fH$, and coincides with the order in $B(H)_{\mathrm{sa}}$, cf. Proposition C.170. Also, $\mathscr{B}$ is the Boolean lattice of Borel subsets of $\sigma(a)$, ordered by inclusion. For each Borel set $A \subset \sigma(a)$ we have an associated spectral projection $e_A \in \mathcal{P}(H)$, and the map $A \mapsto e_A$ defined by the Borel functional calculus, i.e., Theorem B.102, is a lattice homomorphism from $\mathscr{B}$ to $\mathcal{P}(H)$. This follows because from the perspective of the Borel functional calculus the map $A \mapsto e_A$ is really the map $1_A \mapsto e_A$, which is the restriction of a homomorphism between C*-algebras and hence preserves positivity. Let $\mathscr{B}^\infty$ be the Boolean lattice of Borel sets in $\sigma(a)^\infty$. As above, take some unit vector $v \in H$, with corresponding vector state $\omega_v$ on $B(H)$ and associated state $\omega_v^\infty$ on $B(H)^\infty$ as defined in Proposition 8.14, which in turn defines the GNS-representation $\pi_{\omega_v^\infty}$ of $B(H)^\infty$ on the Hilbert space $H_{\omega_v^\infty}$. The lattice homomorphism $A \mapsto e_A$ then extends to a homomorphism

$$e^\infty : \mathscr{B}^\infty \to \mathcal{P}(H_{\omega_v^\infty}); \tag{8.109}$$
$$A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} \sigma(a) \mapsto \pi_{\omega_v^\infty}\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big); \tag{8.110}$$

this defines $e^\infty$ on the basic Borel sets in $\sigma(a)^\infty$ and extends to all of $\mathscr{B}^\infty$. Realizing $H_{\omega_v^\infty}$ as the infinite tensor product $H_v^\infty$, cf. (8.106) - (8.108), we rewrite this as

$$e^\infty\Big(A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} \sigma(a)\Big) = e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}. \tag{8.111}$$

Theorem 8.16. Let $a = a^* \in B(H)$, let $\mu_v$ be the Born measure on $\sigma(a)$ defined by some unit vector $v \in H$, and define $e^\infty$ by (8.111). Let $\sigma(a)^\infty_\mu$ be the set of all points in $\sigma(a)^\infty$ for which (8.92), or, equivalently, (8.93) holds (with $\mu \rightsquigarrow \mu_v$). Then

$$e^\infty(\sigma(a)^\infty_\mu) = 1_{H_{\omega_v^\infty}}. \tag{8.112}$$

Furthermore, if $A \subseteq \sigma(a)$ is Borel measurable, then, using the notation (8.39),

$$\lim_{N\to\infty} S_{1,N}(e_A) = \mu_v(A) \cdot 1_{H_{\omega_v^\infty}}, \tag{8.113}$$

in the strong operator topology (i.e., applied to each fixed vector in $H_{\omega_v^\infty}$).


This is the quantum-mechanical strong law of large numbers, plus its Borel version. In comparison, the strong law of large numbers or Borel’s law of large numbers gives

$$\mu_v^\infty(\sigma(a)^\infty_\mu) = 1. \tag{8.114}$$
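Proof. For any probability measure $\mu$ on any $\sigma$-finite compact space $X$, the corresponding probability measure $\mu^\infty$ on $X^\infty$ is characterized by the property

$$\mu^\infty\Big(A_1 \times \cdots \times A_M \times A \times \prod_{n=M+2}^{\infty} X\Big) = \mu(A)\, \mu^\infty\Big(A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} X\Big),$$

for any $M \in \mathbb{N}$ and Borel sets $A_i \subset X$. The measure $\nu$ on $\sigma(a)^\infty$ defined by

$$\nu\Big(A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} \sigma(a)\Big) = \omega_v^\infty\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) \tag{8.115}$$

satisfies the above property for $\mu = \mu_v$ (with $X = \sigma(a)$) and hence coincides with $\mu_v^\infty$. In view of this, eqs. (C.196) and (8.114) give

$$\langle \Omega_{\omega_v^\infty}, e^\infty(\sigma(a)^\infty_\mu)\, \Omega_{\omega_v^\infty}\rangle = 1. \tag{8.116}$$

For any projection $e'$ and any unit vector $\psi' \in H'$ in any Hilbert space $H'$, the properties $\langle\psi', e'\psi'\rangle = 1$, $\|e'\psi'\| = 1$, and $e'\psi' = \psi'$ are equivalent. Therefore,

$$e^\infty(\sigma(a)^\infty_\mu)\, \Omega_{\omega_v^\infty} = \Omega_{\omega_v^\infty}. \tag{8.117}$$

Consider a vector $\otimes_n \psi_n \in H_v^\infty$, where only $\psi_1, \ldots, \psi_K$ possibly differ from $v$ ($K < \infty$). Noting that by (8.106) - (8.107) the right-hand side of (8.115) may be written as

$$\omega_v^\infty\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) = \big\langle \Omega_{\omega_v^\infty}, \pi_{\omega_v^\infty}\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) \Omega_{\omega_v^\infty}\big\rangle = \big\langle v^\infty, e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\, v^\infty\big\rangle, \tag{8.118}$$

we modify (8.115) so as to define a new measure $\nu'$ on $\sigma(a)^\infty$ by

$$\nu'\Big(A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} \sigma(a)\Big) = \big\langle \otimes_n \psi_n, e_{A_1}^{(1)} \cdots e_{A_M}^{(M)} \otimes_n \psi_n\big\rangle.$$

Generalizing the above case of $\nu$, the measure $\nu'' = \mu_{\psi_1} \times \cdots \times \mu_{\psi_K} \times \prod_{n=K+1}^{\infty} \mu_v$ on $\sigma(a)^\infty$ is characterized by the following two properties:

$$\nu''\Big(A_1 \times \cdots \times A_K \times \prod_{n=K+1}^{\infty} \sigma(a)\Big) = \mu_{\psi_1}(A_1) \cdots \mu_{\psi_K}(A_K); \tag{8.119}$$
$$\nu''\Big(A_1 \times \cdots \times A_M \times A \times \prod_{n=M+2}^{\infty} \sigma(a)\Big) = \mu_v(A)\, \nu''\Big(A_1 \times \cdots \times A_M \times \prod_{n=M+1}^{\infty} \sigma(a)\Big) \quad (M \geq K), \tag{8.120}$$

and hence $\nu' = \nu''$. Therefore, even though $\nu' \neq \mu_v^\infty$, we have $\nu'(\sigma(a)^\infty_\mu) = 1$, since membership of $\sigma(a)^\infty_\mu$ is entirely defined by the tail of the event. Hence we obtain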
$$e^\infty(\sigma(a)^\infty_\mu) \otimes_n \psi_n = \otimes_n \psi_n, \tag{8.121}$$

by the same reasoning as for $v^\infty = \Omega_{\omega_v^\infty}$. Since the linear span of such vectors is dense in $H_v^\infty = H_{\omega_v^\infty}$ and the projection $e^\infty(\sigma(a)^\infty_\mu)$ is bounded, we obtain (8.112).
To derive (8.113), we use the definition of the Born measure $\mu_v$ to find

$$\big\|\big(S_{1,N}(e_A) - \mu_v(A)\big) v^\infty\big\|^2 = \frac{1}{N}\big(\mu_v(A) - \mu_v(A)^2\big), \tag{8.122}$$

which vanishes as $N \to \infty$, so that (8.113) holds on $v^\infty$. A similar computation proves (8.113) on vectors $\otimes_n \psi_n$ as above, since the initial $K$ terms where possibly $\psi_n \neq v$ drop out in the limit $N \to \infty$. Thus we have (8.113) on a dense subspace of $H_{\omega_v^\infty}$. Since the strong limit operator $\mu_v(A) \cdot 1_{H_{\omega_v^\infty}}$ is bounded, this proves (8.113).


An alternative argument shows the mere existence of the limit on the left-hand side of (8.113) on the same dense set, upon which the limit operator is seen to commute with all local and hence (by norm-continuity) with all quasi-local operators. Since $\omega_v$ is pure, so is $\omega_v^\infty$, and hence $\pi_{\omega_v^\infty}$ is irreducible. Thus the limit is a multiple of the unit, and the coefficient $\mu_v(A)$ then follows from the computation

$$\lim_{N\to\infty} \langle v^\infty, S_{1,N}(e_A) v^\infty\rangle = \mu_v(A). \tag{8.123}$$


To reduce the level of abstraction and since it is an important case, we now specialize Theorem 8.16 to a two-level system, i.e., $B = M_2(\mathbb{C})$. In other words, we take $H = \mathbb{C}^2$, and pick a simple observable $a = \mathrm{diag}(0,1)$ with non-degenerate spectrum $\sigma(a) = 2 = \{0,1\}$, so that measurement outcomes are just strings of zeros and ones. Furthermore, we take a unit vector $v = c_0|0\rangle + c_1|1\rangle$, where $|0\rangle = (1,0)$ and $|1\rangle = (0,1)$ form the standard basis of $\mathbb{C}^2$, and $|c_0|^2 + |c_1|^2 = 1$. We write $p = |c_1|^2$. The Born measure $\mu_v$ on $\sigma(a) = \{0,1\}$ is then given by $\mu_v(\{1\}) = p$ and $\mu_v(\{0\}) = 1 - p$; cf. (2.10) - (2.11). Taking $A = \{1\}$, we have $e_A = |1\rangle\langle 1|$. The Hilbert space $(\mathbb{C}^2)_v^\infty$ is the closure of the finite linear span of vectors of the kind $\psi_1 \otimes \psi_2 \otimes \cdots$ with $\psi_n \in \mathbb{C}^2$ and only finitely many $\psi_n$ possibly different from $v$. For $M \in \mathbb{N}$, the operator $|1\rangle\langle 1|^{(M)}$ sends such a vector to $\psi_1 \otimes \psi_2 \otimes \cdots \otimes (|1\rangle\langle 1|\psi_M) \otimes \cdots$, with all $\psi_n$ unaffected except for $n = M$. Eqs. (8.112) - (8.113) then simply read

$$e^\infty(2^{\mathbb{N}}_p) = 1_{(\mathbb{C}^2)_v^\infty}; \tag{8.124}$$
$$\lim_{N\to\infty} \frac{1}{N}\sum_{M=1}^{N} |1\rangle\langle 1|^{(M)} = p \cdot 1_{(\mathbb{C}^2)_v^\infty}, \tag{8.125}$$

where $2^{\mathbb{N}}_p$ denotes the set of all infinite binary strings $x_1x_2\cdots$ for which $x_i \in 2$ and

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} x_n = p, \tag{8.126}$$

and once again the limit in (8.125) is meant strongly, i.e., the expression on the left-hand side must be applied to a fixed vector in $(\mathbb{C}^2)_v^\infty$.
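The variance identity (8.122) behind this strong limit can be verified exactly for a finite number of qubits. The sketch below is an illustration under an assumption made here: only the first $N$ tensor factors are kept, so that $v^\infty$ is replaced by $v^{\otimes N}$, and the function name `frequency_deviation_sq` is hypothetical. In the product basis the frequency operator is diagonal, so the squared norm reduces to a finite sum over binary strings.

```python
from itertools import product

def frequency_deviation_sq(p, n):
    """Exact value of ||(S_{1,N}(e_A) - p) v^{(x)N}||^2 for N = n qubits.

    On the basis vector |x_1 ... x_N> the operator (1/N) sum_M |1><1|^(M)
    has eigenvalue (number of ones)/N, and the product vector has squared
    amplitude p^k (1 - p)^(N - k) on any string containing k ones.
    """
    total = 0.0
    for bits in product((0, 1), repeat=n):
        k = sum(bits)
        weight = p**k * (1 - p) ** (n - k)
        total += weight * (k / n - p) ** 2
    return total

p = 0.3
for n in (1, 4, 10):
    # Compare with the right-hand side of (8.122): (p - p^2) / N.
    print(n, frequency_deviation_sq(p, n), (p - p**2) / n)
```

The two columns agree up to rounding and decay like $1/N$, which is the mechanism that makes (8.125) hold on $v^\infty$; for every finite $N$ the deviation is nonzero whenever $0 < p < 1$.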
Theorem 8.16 forms the (mathematical) culmination of attempts that started in the 1960s to derive the Born rule from other postulates of quantum mechanics, notably the so-called eigenvalue-eigenvector link, according to which a quantum-mechanical observable has a definite value if and only if the current quantum state is an eigenvector of the associated operator. This link is applied to the state $v^\infty$ (or to any other state with approximately the same tail) and the operators $e^\infty(\sigma(a)^\infty_\mu)$ and $\lim_{N\to\infty} S_{1,N}(e_A)$. The idea, then, is that according to (8.112), the property expressed by the projection $e^\infty(\sigma(a)^\infty_\mu)$ is certain in the state $v^\infty$ (for qubits this means that any possible infinite string of binary measurement outcomes has average value $p$). This is reinforced by (8.113), which states that the frequency operator for the outcome $A$ has a sharp limit equal to $\mu_v(A)$ (for qubits, with $A = \{1\}$ this limit is $p$).

However, although the mathematics is suggestive, apart from the fact that the eigenvalue-eigenvector link itself falls prey to Earman’s Principle (in that sharp eigenvalues and eigenvectors are an idealization in a world full of continuous spectra), this particular application of the link makes sense only at $N = \infty$. In this respect, eq. (8.124) has the same drawback as the strong law of large numbers (on which its derivation indeed relies), including the fact that attempts to define probabilities through (8.113) or its special case (8.125) are inherently circular. Moreover, $v^\infty$ fails to be an eigenvector of any finite-$N$ approximant to (8.125), and by the same token, the limit operator defined by (8.125) can only be measured via its individual contributions $|1\rangle\langle 1|^{(M)}$, none of which has $v^\infty$ as an eigenvector; in fact, it can be shown that any joint eigenvector of all projections $|1\rangle\langle 1|^{(M)}$ is orthogonal to the entire space $(\mathbb{C}^2)_v^\infty$ within the complete infinite tensor product $(\mathbb{C}^2)^\infty$.

Problems with Earman’s Principle are avoided if we use Theorem 8.4 (applied to $B = B(H)$) rather than Theorem 8.16: the sequence of operators $S_{1,N}(e_A)$ forms a continuous section of the continuous bundle of C*-algebras with fibers (8.50) - (8.51), whose limit at $N = \infty$, in the sense of (8.46) or (C.560), is given by

$$S_{1,\infty}(e_A) : \omega \mapsto \omega(e_A); \tag{8.127}$$

recall that $S_{1,\infty}(e_A) \in C(S(B(H)))$. In particular, for pure states $\omega = \omega_v$ we obtain the Born probability $\mu_v(A)$. As we have also seen in the commutative case, this limit avoids infinite idealizations and other problems with the law of large numbers.

From the point of view of (asymptotic) Bohrification, $C(S(B(H)))$ provides a classical description of a long run of identical experiments, which becomes increasingly accurate as $N \to \infty$; this is the whole point of the limits (8.46) and (C.560). In particular, the unsound eigenvalue-eigenvector link has been replaced by the role of points $\omega \in S(B(H))$ as truthmakers, which is uncontroversial in classical physics. If the quantum state in each identical experiment on the given (single) system is $\omega$, then the above derivation shows that in the limit $N \to \infty$, this state acquires a classical meaning (which according to Bohr would even be the only meaning it has), namely as the point in the “classical phase space” $S(B(H))$ that gives the relative frequencies of outcomes of the given long runs of identical experiments. Short of deriving the Born rule, this at least provides the reasoning that links the Born measure (which is canonically given by the theory) to experiment.
8.5 Quantum spin systems: Quasi-local C*-algebras


Beside the Born rule, our second application of the previous formalism is to quantum spin systems, especially to spontaneous symmetry breaking (SSB), see Chapter 10. Postponing a conceptual discussion of infinite systems in their role of idealizations of finite systems to the preamble of that chapter, for the moment we just describe infinite quantum spin systems mathematically. As in §C.14, we take a Hilbert space $H$, here assumed finite-dimensional, i.e., $H = \mathbb{C}^n$, and use the standard lattice $\mathbb{Z}^d \subset \mathbb{R}^d$ in dimension $d$. For any finite subset $\Lambda \subset \mathbb{Z}^d$, i.e., $\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)$, we put

$$H_\Lambda = \otimes_{x\in\Lambda} H_x; \tag{8.128}$$
$$A_\Lambda = B(H_\Lambda) = \otimes_{x\in\Lambda} B(H_x), \tag{8.129}$$

where $H_x = H$ for each $x \in \Lambda$, cf. (C.297) and (C.303). The symbolic notations

$$A = \otimes_{x\in\mathbb{Z}^d} B(H) = \lim_{\Lambda} A_\Lambda = \overline{\bigcup_{\Lambda\in\mathcal{P}_f(\mathbb{Z}^d)} A_\Lambda}^{\|\cdot\|}, \tag{8.130}$$

all come down to the same thing (see §C.14, notably (C.323) and (C.317)) and define a quasi-local C*-algebra. Elements of each $A_\Lambda \subset A$ are called local observables, those in the closure of their union are referred to as quasi-local observables. Eq. (8.129) defines a map $\Lambda \mapsto A_\Lambda$, which has three important properties:

$$A_{\Lambda^{(1)}} \subseteq A_{\Lambda^{(2)}} \text{ if } \Lambda^{(1)} \subseteq \Lambda^{(2)} \quad \text{(Isotony)}; \tag{8.131}$$
$$[A_{\Lambda^{(1)}}, A_{\Lambda^{(2)}}] = 0 \text{ if } \Lambda^{(1)} \cap \Lambda^{(2)} = \emptyset \quad \text{(Einstein locality)}; \tag{8.132}$$
$$A'_\Lambda = A_{\Lambda'} \quad \text{(Haag duality)}, \tag{8.133}$$

where $A'_\Lambda$ in (8.133) is the commutant of $A_\Lambda$ within $A$, and, in cute notation, we put $\Lambda' = \mathbb{Z}^d \setminus \Lambda$ (which is infinite), so that the right-hand side of (8.133) denotes

$$A_{\Lambda'} = \otimes_{x\in\Lambda'} B(H) = \overline{\bigcup_{\Lambda^{(1)}\in\mathcal{P}_f(\mathbb{Z}^d\setminus\Lambda)} A_{\Lambda^{(1)}}}^{\|\cdot\|}, \tag{8.134}$$

which is a C*-subalgebra of $A$. Since $\Lambda^{(2)} \subset \mathbb{Z}^d \setminus \Lambda^{(1)}$ whenever $\Lambda^{(1)} \cap \Lambda^{(2)} = \emptyset$, Haag duality implies Einstein locality (and sharpens it), but it is still worth mentioning these properties separately: although in quantum spin systems (8.133), and hence (8.132), holds, Einstein locality is a more fundamental property (e.g. it is also valid in algebraic quantum field theory, where Haag duality may well fail).
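Einstein locality (8.132) can be made concrete on the smallest possible lattice. The following sketch, under assumptions chosen here for illustration (two sites, $H = \mathbb{C}^2$, Pauli matrices as sample local observables, and hand-written helpers `kron`, `matmul`, `commutator`), embeds single-site operators into $B(H \otimes H)$ and compares commutators for disjoint and for overlapping regions.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

a_site1 = kron(sigma_x, identity)  # localized at site 1
b_site2 = kron(identity, sigma_z)  # localized at site 2
b_site1 = kron(sigma_z, identity)  # localized at site 1 as well

disjoint = commutator(a_site1, b_site2)     # regions {1} and {2} are disjoint
overlapping = commutator(a_site1, b_site1)  # both operators live on site 1
print(disjoint)
print(overlapping)
```

The first commutator vanishes identically, whereas the second does not, since the two Pauli matrices fail to commute on the shared site.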

We now discuss some C*-algebraic concepts that will be needed for the analysis of SSB. Through the associated GNS-representation $\pi_\omega : A \to B(H_\omega)$, any state $\omega$ on $A$ defines two interesting subalgebras of $B(H_\omega)$, which a priori may be different:


• The center $A^c_\omega = \pi_\omega(A)'' \cap \pi_\omega(A)'$;
• The algebra at infinity $A^\infty_\omega = \bigcap_{\Lambda\in\mathcal{P}_f(\mathbb{Z}^d)} \pi_\omega(A_{\Lambda'})''$.

Recall that the center of a von Neumann algebra $M \subset B(H)$ is $M \cap M'$, and that $M$ is called a factor if $M \cap M' = \mathbb{C} \cdot 1$ (cf. §C.21), so $A^c_\omega$ is the center of the von Neumann algebra $\pi_\omega(A)''$. It is easy to show from Einstein locality that $A^\infty_\omega \subseteq A^c_\omega$. If each local algebra $A_\Lambda$ is simple, Haag duality yields the opposite inclusion, so in that case,

$$A^\infty_\omega = A^c_\omega. \tag{8.135}$$

Given (8.129), this applies as long as $\dim(H) < \infty$, in which case also $A$ is simple.
The algebra at infinity provides a new perspective on the macroscopic observables in §8.2. Averages like $|\Lambda|^{-1}\sum_{x\in\Lambda} b(x)$, where $b \in B(H)$, do not have a limit in $A$ as $\Lambda \uparrow \mathbb{Z}^d$, but (depending on $\omega$) their representatives $|\Lambda|^{-1}\sum_{x\in\Lambda} \pi_\omega(b(x))$ may have a weak limit in $B(H_\omega)$. If they do, Einstein locality implies that the limit operator lies in the algebra at infinity $A^\infty_\omega$ (and hence, assuming (8.135), in $A^c_\omega$). If the algebra at infinity is trivial (i.e. $\mathbb{C} \cdot 1_{H_\omega}$), macroscopic observables are therefore “c-numbers”, i.e., multiples of the unit operator. In particular, they do not fluctuate, which is among the defining properties of pure thermodynamic phases. Formally, this idea is captured by the following generalization of the notion of a pure state:


Definition 8.17. A representation $\pi(A)$ is primary if $\pi(A)'' \cap \pi(A)'$ is trivial.
A state $\omega \in S(A)$ is primary if the GNS-representation $\pi_\omega$ is primary.


For compact groups $G$ (or rather their group C*-algebras $C^*(G)$), all representations are completely reducible, and a representation is primary iff it is a (possibly infinite) multiple of some irreducible representation. However, this is not the right picture for general groups or C*-algebras, which requires some discussion. In preparation, we call some representation $\pi'(A)$ on a Hilbert space $H' \subset H$ a subrepresentation of a representation $\pi(A)$ on $H$, written $\pi' \subseteq \pi$, if $\pi' = \pi|_{H'}$. Subrepresentations $\pi'$ of $\pi$ correspond to projections $e \in \pi(A)'$, such that $\pi'(a) = e\pi(a)$. It follows that $\pi_1(A)$ and $\pi_2(A)$ have equivalent subrepresentations iff there exists a nonzero partial isometry $w : H_1 \to H_2$ such that $w\pi_1(a) = \pi_2(a)w$ for all $a \in A$.


Definition 8.18. Two representations $\pi_1$ and $\pi_2$ of a C*-algebra $A$ are called:


1. equivalent if there is a unitary $u : H_1 \to H_2$ such that $u\pi_1(a)u^* = \pi_2(a)$ ($a \in A$);

2. quasi-equivalent if every subrepresentation of $\pi_1$ has a subrepresentation that is equivalent to some subrepresentation of $\pi_2$, and vice versa;

3. disjoint if they do not have any equivalent subrepresentations.


We say that two states $\omega_1$ and $\omega_2$ on $A$ are equivalent, disjoint, or quasi-equivalent if the corresponding GNS-representations $\pi_{\omega_1}$ and $\pi_{\omega_2}$ have the said property.


In other words, $\pi_1$ and $\pi_2$ are quasi-equivalent iff $\pi_1$ has no subrepresentations disjoint from $\pi_2$, and vice versa. This, in turn, is equivalent to the property that the set of $\pi_i$-normal states on $A$, i.e. states of the form $a \mapsto \mathrm{Tr}(\rho\pi_i(a))$ with $\rho \in \mathcal{D}(H_i)$, is the same for $i = 1$ as it is for $i = 2$. Contrapositively, $\pi_1$ and $\pi_2$ are disjoint iff no state exists that is both $\pi_1$-normal and $\pi_2$-normal. For example, taking $A = C(X)$, in which case states are probability measures on $X$, equivalence and disjointness of states recovers the usual notions of equivalence and disjointness of measures, respectively (i.e., having the same null sets and having disjoint supports).
Proposition 8.19. For any state $\omega$, if $\omega = t\omega_1 + (1-t)\omega_2$ for some $t \in (0,1)$, then $\omega_1$ and $\omega_2$ are disjoint iff there is a projection $e \in A^c_\omega = \pi_\omega(A)' \cap \pi_\omega(A)''$ such that

$$\pi_\omega(A)|_{eH_\omega} \cong \pi_{\omega_1}(A); \tag{8.136}$$
$$\pi_\omega(A)|_{e^\perp H_\omega} \cong \pi_{\omega_2}(A). \tag{8.137}$$

Since subrepresentations of $\pi_\omega(A)$ always correspond to projections $e \in \pi_\omega(A)'$, the key assumption being made here is that $e$ also lies in the weak closure $\pi_\omega(A)''$.


Proof. One direction is easy: if (8.136) - (8.137) hold, then (arguing by contradiction) equivalent subrepresentations $\pi_1(A)$ of $\pi_{\omega_1}(A)$ and $\pi_2(A)$ of $\pi_{\omega_2}(A)$ are given by projections $e_1 \leq e$ and $e_2 \leq e^\perp = 1_{H_\omega} - e$, respectively, through

$$\pi_i(a) = \pi_\omega(a)|_{e_iH_\omega} \quad (i = 1,2,\ a \in A), \tag{8.138}$$

and the partial isometry $w$ on $H_\omega$ whose restriction to $e_1H_\omega$ implements a (unitary) equivalence between $\pi_1(A)$ and $\pi_2(A)$ by definition satisfies $w^*w = e_1$, $ww^* = e_2$. Moreover, $e_1 \leq e$ implies $we = w$ and $e_2 \leq e^\perp$ implies $e^\perp w = w$, which together give $e^\perp we = w$. Furthermore, again by definition, $w \in \pi_\omega(A)'$. If now $e \in \pi_\omega(A)''$, then $we = ew$. Combining these equalities gives $w = e^\perp ew = 0$, which is the desired contradiction.


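Lemma 8.20. For any functional $\omega' \in A^*$ such that $0 \leq \omega' \leq \omega$, where $\omega \in S(A)$, there is an operator $c \in \pi_\omega(A)'$ on $H_\omega$ such that $0 \leq c \leq 1$ and

$$\omega'(a) = \langle \Omega_\omega, c\pi_\omega(a)\Omega_\omega\rangle \quad (a \in A). \tag{8.139}$$

In particular, there is a vector $\xi \in H_\omega$ such that

$$\omega'(a) = \langle \xi, \pi_\omega(a)\xi\rangle_{H_\omega}. \tag{8.140}$$

Proof. Cauchy–Schwarz for the positive semidefinite form $(a,b)' = \omega'(a^*b)$ gives

$$|\omega'(a^*b)|^2 \leq \omega'(a^*a)\omega'(b^*b) \leq \omega(a^*a)\omega(b^*b) = \|\pi_\omega(a)\Omega_\omega\|^2 \|\pi_\omega(b)\Omega_\omega\|^2.$$

Hence we obtain a well-defined positive quadratic form $B$ on $H_\omega$, initially defined on the dense domain $\pi_\omega(A)\Omega_\omega \times \pi_\omega(A)\Omega_\omega$ by the formula

$$B(\pi_\omega(a)\Omega_\omega, \pi_\omega(b)\Omega_\omega) = \omega'(a^*b), \tag{8.141}$$

and extended to $H_\omega \times H_\omega$ by continuity; the above inequality immediately gives $|B(\varphi,\psi)| \leq \|\varphi\|\|\psi\|$, and hence Proposition B.79 yields an operator $0 \leq c \leq 1_{H_\omega}$ such that $B(\varphi,\psi) = \langle\varphi, c\psi\rangle$. With (8.141), this gives (8.139). We now compute

$$\omega'(a^*b^*d) = B(\pi_\omega(ba)\Omega_\omega, \pi_\omega(d)\Omega_\omega) = \langle \pi_\omega(a)\Omega_\omega, \pi_\omega(b^*)c\pi_\omega(d)\Omega_\omega\rangle$$
$$\phantom{\omega'(a^*b^*d)} = B(\pi_\omega(a)\Omega_\omega, \pi_\omega(b^*d)\Omega_\omega) = \langle \pi_\omega(a)\Omega_\omega, c\pi_\omega(b^*)\pi_\omega(d)\Omega_\omega\rangle,$$

so that $[c, \pi_\omega(b^*)] = 0$ for each $b \in A$, i.e., $c \in \pi_\omega(A)'$. Writing $c = c_1^2$ with $c_1^* = c_1$, and then $\xi = c_1\Omega_\omega$, completes the proof.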
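We continue the proof of Proposition 8.19 in the converse direction. Assume

$$\omega = t\omega_1 + (1-t)\omega_2 = \omega'_1 + \omega'_2, \tag{8.142}$$

with $\omega'_1 = t\omega_1$ and $\omega'_2 = (1-t)\omega_2$, so that $0 \leq \omega'_1 \leq \omega$ and $0 \leq \omega'_2 \leq \omega$. It follows from the first claim in Lemma 8.20 that there is $c \in B(H_\omega)$ as stated such that

$$\omega'_1(a) = \langle \Omega_\omega, c\pi_\omega(a)\Omega_\omega\rangle; \tag{8.143}$$
$$\omega'_2(a) = \langle \Omega_\omega, (1_{H_\omega} - c)\pi_\omega(a)\Omega_\omega\rangle, \tag{8.144}$$

where (8.144) follows from (8.143), (C.196), and $\omega = \omega'_1 + \omega'_2$. Define $\omega' \in A^*$ by

$$\omega'(a) = \langle \Omega_\omega, c(1_{H_\omega} - c)\pi_\omega(a)\Omega_\omega\rangle. \tag{8.145}$$

We have $0 \leq \omega' \leq \omega'_1$ (since $c(1_{H_\omega} - c) \leq c$) as well as $0 \leq \omega' \leq \omega'_2$ (since also $c(1_{H_\omega} - c) \leq 1_{H_\omega} - c$). Now assume that $\omega_1$ and $\omega_2$ are disjoint. Applying (8.140) with $\omega \rightsquigarrow \omega'_i$ shows that $\omega'$ is $\pi_{\omega_1}$-normal as well as $\pi_{\omega_2}$-normal, so that it follows from the remarks following Definition 8.18 that $\omega' = 0$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(A)$ by the GNS-construction, this implies $c(1_{H_\omega} - c) = 0$, and hence $c^2 = c$. Since $c \geq 0$, which implies $c^* = c$, it follows that $c$ is a projection, henceforth called $e$. Therefore,

$$\omega_1(a) = \langle \Omega_\omega, e\pi_\omega(a)\Omega_\omega\rangle / \|e\Omega_\omega\|^2; \tag{8.146}$$
$$\omega_2(a) = \langle \Omega_\omega, e^\perp\pi_\omega(a)\Omega_\omega\rangle / \|e^\perp\Omega_\omega\|^2, \tag{8.147}$$

where $t = \|e\Omega_\omega\|^2$. We see from these formulae and Proposition C.91 that $\pi_{\omega_1}$ and $\pi_{\omega_2}$ are equivalent to the restrictions of $\pi_\omega$ to $eH_\omega$ and $e^\perp H_\omega$, respectively; under this equivalence, the cyclic vectors $\Omega_{\omega_1}$ and $\Omega_{\omega_2}$ correspond with $e\Omega_\omega/\|e\Omega_\omega\|$ and $e^\perp\Omega_\omega/\|e^\perp\Omega_\omega\|$, respectively. Since $e \in \pi_\omega(A)'$ by Lemma 8.20, it only remains to be shown that $e \in \pi_\omega(A)''$. To this effect, for any $b \in \pi_\omega(A)'$ and $\psi \in H_\omega$, define $\omega'' \in A^*$ by

$$\omega''(a) = \langle e^\perp be\psi, \pi_\omega(a)e^\perp be\psi\rangle. \tag{8.148}$$

Then $\omega''$ is positive, as well as $\pi_{\omega_2}$-normal, the latter because of the presence of the projection $e^\perp$ and (8.147). But for $a \in A^+$ we have the inequalities

$$0 \leq \omega''(a) \leq \|e^\perp b\|^2 \langle e\psi, \pi_\omega(a)e\psi\rangle, \tag{8.149}$$

so that, up to the constant $\|e^\perp b\|^2$, we have $0 \leq \omega'' \leq \omega'''$ for the state (assuming $e\psi$ is a unit vector)

$$\omega'''(a) = \langle \psi, e\pi_\omega(a)e\psi\rangle. \tag{8.150}$$

Since $e\psi \in eH_\omega$, the latter state is $\pi_{\omega_1}$-normal, so that $\omega''$ is itself $\pi_{\omega_1}$-normal by Lemma 8.20 (which argument by now should sound familiar). Again invoking disjointness of $\omega_1$ and $\omega_2$, it follows that $\omega'' = 0$, which, since $\psi$ was arbitrary, in turn yields $e^\perp be = 0$ for any $b \in \pi_\omega(A)'$. This forces $e \in \pi_\omega(A)''$.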
The first of the following corollaries to Proposition 8.19 is Hepp’s Lemma:


Lemma 8.21. Let $\pi : A \to B(H)$ be a representation of $A$, and let $\psi_1, \psi_2$ be unit vectors in $H$. Then the vector states $\omega_i(a) = \langle\psi_i, \pi(a)\psi_i\rangle$ ($i = 1,2$) are disjoint iff

$$\langle\psi_1, \pi(a)\psi_2\rangle = 0 \quad (a \in A). \tag{8.151}$$


Proof. Take, for example, $\omega = \frac{1}{2}(\omega_1 + \omega_2)$ in Proposition 8.19.


Corollary 8.22. 1. Two primary states are either disjoint or quasi-equivalent.
2. A state is primary iff it has no convex decomposition into disjoint states.


Recall that a state is pure if it has no nontrivial convex decomposition whatsoever.
The analogy between pure states and primary states may be completed as follows:


• $\omega$ pure $\Leftrightarrow$ $\pi_\omega(A)' = \mathbb{C} \cdot 1$ (cf. Theorem C.90);
• $\omega$ primary $\Leftrightarrow$ $\pi_\omega(A)' \cap \pi_\omega(A)'' = \mathbb{C} \cdot 1$ (cf. Definition 8.17).


A physical property of primary states is that the corresponding correlation functions 
have a clustering property of a kind that may even be experimentally accessible: 


Theorem 8.23. A state $\omega$ on a quasi-local C*-algebra $A$ (8.130) has trivial algebra
at infinity, i.e., $A_\omega^\infty = \mathbb{C}\cdot 1$, iff it is clustering, in the following sense: for each $a \in A$
and $\varepsilon > 0$ there is a finite $\Lambda \subset \mathbb{Z}^d$ such that for all $b \in A_{\Lambda^c}$ with $\|b\| = 1$ one has


$|\omega(ab) - \omega(a)\omega(b)| < \varepsilon$. (8.152)


In particular, if @ is primary, then it is clustering and hence (8.152) holds. 
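
The contrast between clustering and its failure can be glimpsed in a finite toy model. The following Python sketch (using NumPy; the four-site chain, the helper names, and the choice of states are illustrative assumptions of ours, not taken from the text) computes the truncated correlation $\omega(ab) - \omega(a)\omega(b)$ for spins at opposite ends of the chain, once in a product state and once in an equal mixture of the all-up and all-down product states. A finite chain can only hint at the infinite-volume statement of Theorem 8.23.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])

def kron_all(mats):
    """Tensor product of a list of matrices."""
    return reduce(np.kron, mats)

def at_site(op, site, n):
    """Embed a one-site operator at position `site` of an n-site chain."""
    return kron_all([op if k == site else I2 for k in range(n)])

def expect(rho, a):
    """Expectation value of a in the state with density matrix rho."""
    return np.trace(rho @ a).real

def truncated_correlation(rho, a, b):
    """omega(ab) - omega(a) omega(b) for the state given by rho."""
    return expect(rho, a @ b) - expect(rho, a) * expect(rho, b)

n = 4                                   # hypothetical chain length
up = np.diag([1.0, 0.0])
down = np.diag([0.0, 1.0])
rho_up = kron_all([up] * n)             # product state: all spins up
rho_mixed = 0.5 * (rho_up + kron_all([down] * n))  # mixture of all-up and all-down

a = at_site(SZ, 0, n)                   # observable at the left end
b = at_site(SZ, n - 1, n)               # observable at the right end

corr_product = truncated_correlation(rho_up, a, b)
corr_mixture = truncated_correlation(rho_mixed, a, b)
print(corr_product, corr_mixture)
```

In the product state the correlation vanishes, whereas in the mixture it stays equal to 1 however far apart the two sites are; this is the finite-size shadow of a state that is a convex combination of disjoint states.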


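Proof. The complete proof is quite technical, but the main idea is as follows. Choose
finite regions $\Lambda_n$ moving to infinity (i.e., eventually avoiding any given $\Lambda$), and pick
elements $c_n \in A_{\Lambda_n}$, $\|c_n\| = 1$. The sequence $(\pi_\omega(c_n))$ in $B(H_\omega)$ has a weakly
convergent subsequence with limit $c \in B(H_\omega)$. This follows from the Banach-Alaoglu
Theorem B.48, applied to $B(H_\omega)$ seen as the dual space of $B_1(H_\omega)$: on the unit
ball, the corresponding weak*-topology on $B(H_\omega)$ coincides with the weak operator
topology, so that the unit ball in $B(H_\omega)$ is weakly compact and the theorem applies.


• By von Neumann’s Bicommutant Theorem C.127 we have $c \in \pi_\omega(A)''$.
• By Einstein locality (8.132) and the delocalization of the $\Lambda_n$, also $c \in \pi_\omega(A)'$.


Hence $c$ lies in the center $\pi_\omega(A)'' \cap \pi_\omega(A)'$, and by a more refined argument (which is
unnecessary if the algebra at infinity $A_\omega^\infty$ coincides with this center), even $c \in A_\omega^\infty$.
So if $A_\omega^\infty = \mathbb{C}\cdot 1$ we have $c = \langle \Omega_\omega, c\,\Omega_\omega\rangle \cdot 1$. On the other hand,


$\langle \Omega_\omega, c\,\Omega_\omega\rangle = \lim_n \langle \Omega_\omega, \pi_\omega(c_n)\Omega_\omega\rangle = \lim_n \omega(c_n)$,
so that we may compute:
$\lim_n \omega(a c_n) = \lim_n \langle \Omega_\omega, \pi_\omega(a)\pi_\omega(c_n)\Omega_\omega\rangle = \langle \Omega_\omega, \pi_\omega(a) c\,\Omega_\omega\rangle = \omega(a) \lim_n \omega(c_n)$.


Thus for any $\varepsilon > 0$ there is an $N$ such that $|\omega(a c_n) - \omega(a)\omega(c_n)| < \varepsilon$ for all $n > N$.
To derive (8.152) from this, an easy reductio ad absurdum argument suffices.
The converse direction follows from Kaplansky’s Density Theorem C.131.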
The converse direction follows from Kaplansky’s Density Theorem C.131. 


8.6 Quantum spin systems: Bundles of C*-algebras 


In this section we reformulate the theory of quantum spin systems in the continuous
C*-bundle language of §8.2. First, for each $N \in \mathbb{N}$ we define $\Lambda_N \in \mathcal{P}_f(\mathbb{Z}^d)$ by


$\Lambda_N = \{x \in \mathbb{Z}^d \mid \|x\| < N\}$. (8.153)


We then have the following analogue of the continuous bundle of C*-algebras $A^{(q)}$
of Theorem 8.8. The base space remains $I = 1/\dot{\mathbb{N}} \subset [0,1]$, where
$\dot{\mathbb{N}} = \{1,2,\ldots,\infty\}$ (seen as possible values of $1/\hbar$), and the fibers are given by


$A_0 = A = \lim_N A_{\Lambda_N} = \overline{\bigcup_{N \in \mathbb{N}} A_{\Lambda_N}}$; (8.154)
$A_{1/N} = A_{\Lambda_N} = B(H_{\Lambda_N})$ ($N \in \mathbb{N}$), (8.155)


cf. (8.128) - (8.130), still assuming $\dim(H) < \infty$. As before, the topology of this
bundle is defined through its continuous cross-sections $(a_{1/N})_{N \in \dot{\mathbb{N}}}$, which are the
analogues of the quasi-local sequences of Definition 8.7. Given (8.154) - (8.155),
each fiber algebra $A_{1/N}$ is a subalgebra of $A_0$, and some sequence $(a_{1/N})_{N \in \dot{\mathbb{N}}}$ simply
defines a continuous cross-section of the bundle iff within $A$ (i.e. in norm) we have


$\lim_{N\to\infty} a_{1/N} = a_0$. (8.156)


In other words, a sequence $(a_{1/N})_{N \in \mathbb{N}}$ with $a_{1/N} \in A_{1/N} \subset A$ is quasi-local in the
sense of Definition 8.7 iff it converges in $A$ (i.e., iff it is Cauchy in the norm of $A$).

The continuous bundle of Theorem 8.4 makes equally good sense for quantum
spin systems. First, with $B = B(H) \cong M_n(\mathbb{C})$, the fibers are obviously given by


$A_0^{(c)} = C(S(B(H)))$; (8.157)
$A_{1/N}^{(c)} = B(H_{\Lambda_N})$. (8.158)


Second, the continuous sections are once again specified via symmetrization maps
$S_{M,N} : B(H_{\Lambda_M}) \to B(H_{\Lambda_N})$, (8.159)

defined similarly to (8.39), namely via canonical symmetrizers
$S_N : B(H_{\Lambda_N}) \to B(H_{\Lambda_N})$ (8.160)


that are defined à la (8.35) - (8.36), where this time the tensor product and ensuing
permutation in (8.35) are over all sites $x \in \Lambda_N$. Regarding $a_{1/M} \in B(H_{\Lambda_M})$ as an
element of $B(H_{\Lambda_N})$ via the embedding $A_{\Lambda_M} \subset A_{\Lambda_N}$, we finally define $S_{M,N}$ by


$S_{M,N}(a_{1/M}) = S_N(a_{1/M})$. (8.161)

Symmetric and quasi-symmetric sequences may then be defined exactly as in
Definitions 8.2 and 8.3; each quasi-symmetric sequence $(a_{1/N})_{N \in \mathbb{N}}$ duly has a limit
$a_0 \in A_0^{(c)}$ given by (8.46), where $\omega^N$ is defined as in (8.47), once again with a tensor
product over all sites $x \in \Lambda_N$. By definition, the continuous sections of the bundle
(8.157) - (8.158) are then given by the quasi-symmetric sequences.
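
The symmetrization maps can be made concrete for a handful of spins. The sketch below is a minimal NumPy illustration under the assumption that the canonical symmetrizer $S_N$ is the average over all permutations of the $N$ sites (our reading of (8.35) - (8.36), which are not reproduced in this section); the function names `permute_sites`, `symmetrize` and `S_MN`, the qubit dimension, and the sample state are hypothetical choices of ours. It checks that $S_{1,3}(\sigma_z)$ is the mean-field average of $\sigma_z$ over three sites, and that a product state $\rho^N$ assigns it the one-site value $\rho(\sigma_z)$.

```python
import numpy as np
from functools import reduce
from itertools import permutations

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])

def kron_all(mats):
    """Tensor product of a list of matrices."""
    return reduce(np.kron, mats)

def permute_sites(op, perm, d=2):
    """Conjugate an operator on (C^d)^{(x)n} by the site permutation `perm`."""
    n = len(perm)
    tensor = op.reshape([d] * (2 * n))          # n row indices, then n column indices
    axes = list(perm) + [n + p for p in perm]   # permute rows and columns alike
    return tensor.transpose(axes).reshape(d ** n, d ** n)

def symmetrize(op, n, d=2):
    """Assumed form of S_N: average over all n! site permutations."""
    perms = list(permutations(range(n)))
    return sum(permute_sites(op, p, d) for p in perms) / len(perms)

def S_MN(a_M, M, N, d=2):
    """Sketch of S_{M,N}: pad a_M with unit operators on N - M sites, then symmetrize."""
    padded = np.kron(a_M, np.eye(d ** (N - M)))
    return symmetrize(padded, N, d)

N = 3
a_N = S_MN(SZ, 1, N)

# Mean-field average (1/N) * sum_j sigma_z at site j, for comparison.
mean_field = sum(kron_all([SZ if k == j else I2 for k in range(N)])
                 for j in range(N)) / N

rho = np.array([[0.7, 0.1], [0.1, 0.3]])        # hypothetical one-site density matrix
rho_N = kron_all([rho] * N)
lhs = np.trace(rho_N @ a_N).real                # rho^N(S_{1,N}(sigma_z))
rhs = np.trace(rho @ SZ).real                   # rho(sigma_z)
print(np.allclose(a_N, mean_field), lhs, rhs)
```

Averaging over all $N!$ permutations is only feasible for very small $N$; the point of the sketch is the structure of the map, not efficiency.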

Although the fibers $A$ in (8.154) and $C(S(B(H)))$ in (8.157) are as wide apart as
they could possibly be, they stunningly arise as limit algebras at $\hbar = 0$ (i.e., $N = \infty$
or $\Lambda = \mathbb{Z}^d$) for the same fiber algebras (8.155) and (8.158) at $\hbar > 0$ (i.e., $N < \infty$ or
$\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)$). As in §8.2, the difference lies in the choice of the topology on the
bundle, defined via the continuous sections, which in the first case are the quasi-local
sequences, and in the second are the quasi-symmetric (i.e., macroscopic) ones.

An interesting connection between these bundles can be obtained via the follow- 
ing concept, which in a way justifies the introduction of the bundles themselves. 


Definition 8.24. A continuous field of states on a continuous bundle of C*-algebras
with fibers $(A_{1/N})_{N \in \dot{\mathbb{N}}}$ is a family $(\omega_{1/N})_{N \in \dot{\mathbb{N}}}$ where


$\omega_{1/N} \in S(A_{1/N})$; (8.162)
$\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_0(a_0)$, (8.163)

for each continuous cross-section $(a_{1/N})$. In that case, we write
$\omega_0 = \lim_{N\to\infty} \omega_{1/N}$, (8.164)


despite the fact that all states in question may be defined on different C*-algebras.
For example, any state $\omega$ on $A_0 = A$ as in (8.154) defines a continuous field:


Proposition 8.25. For any state $\omega \in S(A)$, the set $(\omega_{1/N})_{N \in \dot{\mathbb{N}}}$ of states defined by


$\omega_0 = \omega$; (8.165)
$\omega_{1/N} = \omega_{|A_{\Lambda_N}}$, (8.166)


is a continuous field of states on the bundle with fibers (8.154) - (8.155).
Proof. We use the notation of Definition 8.7. For local sequences (8.57) we have
$\omega_{1/N}(a_{1/N}) = \omega(a_{1/N}) = \omega(a_{1/M})$,


for all $N > M$. Since $a_0 = a_{1/M}$, this equals $\omega_0(a_0)$. For quasi-local sequences, $a_0$ is
the limit of the sequence $(a_{1/N})$ in the norm of $A$, so that $\omega(a_{1/N}) \to \omega(a_0)$.


Definition 8.26. A state $\omega \in S(A)$ is macroscopic if $\lim_{N\to\infty} \omega(a_{1/N})$ exists for any
(quasi-) symmetric sequence $(a_{1/N})$.


It does not matter whether we put “symmetric” or “quasi-symmetric” here, since
existence of the limit for symmetric sequences implies its existence on quasi-symmetric
sequences. Indeed, using the fact that $\|\omega\| = 1$, we may estimate

$|\omega(a_{1/N}) - \omega(a_{1/M})| \leq |\omega(\tilde{a}_{1/N}) - \omega(\tilde{a}_{1/M})| + \|a_{1/N} - \tilde{a}_{1/N}\| + \|a_{1/M} - \tilde{a}_{1/M}\|$, (8.167)


for any sequence $(\tilde{a}_{1/N})$. Using Definition 8.3, and hence taking $(\tilde{a}_{1/N})$ symmetric,
we see that if $(\omega(\tilde{a}_{1/N}))$ is a Cauchy sequence, then so is $(\omega(a_{1/N}))$.


Proposition 8.27. A macroscopic state $\omega$ determines a state $\omega_0^{(c)}$ on $C(S(B))$ by

$\omega_0^{(c)}(a_0) = \lim_{N\to\infty} \omega(a_{1/N})$, (8.168)


where $(a_{1/N})$ is any quasi-symmetric sequence with limit $a_0 \in C(S(B))$, cf. (8.46).


Proof. First, note that $\omega_0^{(c)}$ is independent of the choice of the approximating
sequence $(a_{1/N})$, since by the same argument as in the proof of Proposition C.126, if
$a_{1/N} \to a_0$ as well as $a'_{1/N} \to a_0$, we have


$\lim_{N\to\infty} \|a_{1/N} - a'_{1/N}\| = \|a_0 - a_0\| = 0$, (8.169)


and because $\|\omega\| = 1$ for any state $\omega$, we also have
$|\omega(a_{1/N} - a'_{1/N})| \leq \|a_{1/N} - a'_{1/N}\|$. (8.170)
Eqs. (8.169) - (8.170) obviously imply
$\lim_{N\to\infty} \omega(a_{1/N}) = \lim_{N\to\infty} \omega(a'_{1/N})$. (8.171)
We next show that if $a_{1/N} \to a_0$ and $b_{1/N} \to b_0$ in the sense of (C.560), then
$a_{1/N} b_{1/N} \to a_0 b_0$.


If $(a_{1/N})$ is a symmetric sequence à la (8.43), and likewise $(b_{1/N})$, where we may
assume without loss of generality that $M$ is the same for both, then


$a_0(\rho) = \rho^M(a_{1/M})$, (8.172)


where $\rho \in S(B)$, and likewise for $b_0$. Using (8.38), we obtain
$\lim_{N\to\infty} \rho^N(a_{1/N} b_{1/N}) = \rho^M(a_{1/M})\,\rho^M(b_{1/M}) = a_0(\rho) b_0(\rho) = (a_0 b_0)(\rho)$. (8.173)


In particular, if $a_{1/N} \to a_0$, then $a_{1/N}^* a_{1/N} \to a_0^* a_0$. Since $\omega$ is a state, it follows
that $\omega_0^{(c)}(a_0^* a_0) \geq 0$, and since also $\omega_0^{(c)}(1_{S(B)}) = 1$ (because the sequence with
$a_{1/N} = 1_{H_{\Lambda_N}}$ converges to $1_{S(B(H))}$), the claim follows for symmetric sequences.
For quasi-symmetric sequences $(a_{1/N})$ the result follows by approximating $(a_{1/N})$
with symmetric sequences (cf. Definition 8.3).
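
The multiplicativity of the limit expressed by (8.173) can be checked numerically in a small example. The following NumPy sketch takes the one-site operators $\sigma_z$ and $\sigma_x$, their mean-field averages over $N$ sites as symmetric sequences, and a one-site density matrix chosen by us for illustration (all names and values are assumptions, not from the text). It computes the distance between $\rho^N(a_{1/N} b_{1/N})$ and $a_0(\rho) b_0(\rho)$ for increasing $N$.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])
SX = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron_all(mats):
    """Tensor product of a list of matrices."""
    return reduce(np.kron, mats)

def mean_field(op, n):
    """(1/n) * sum_j op_j: symmetrization of a one-site operator over n sites."""
    return sum(kron_all([op if k == j else I2 for k in range(n)])
               for j in range(n)) / n

rho = np.array([[0.7, 0.1], [0.1, 0.3]])   # hypothetical one-site state
a0 = np.trace(rho @ SZ).real               # a_0(rho) = rho(sigma_z)
b0 = np.trace(rho @ SX).real               # b_0(rho) = rho(sigma_x)

errors = []
for n in (2, 4, 6, 8):
    rho_n = kron_all([rho] * n)            # product state rho^N
    value = np.trace(rho_n @ mean_field(SZ, n) @ mean_field(SX, n))
    errors.append(abs(value - a0 * b0))

print(errors)
```

For this particular choice the on-site term $\rho(\sigma_z \sigma_x)$ vanishes, so the error equals $a_0(\rho) b_0(\rho)/N$ and decays like $1/N$, in line with the limit in (8.173).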
Each state $\omega_0^{(c)} \in S(A_0^{(c)})$ is represented by a probability measure $\mu_0$ on the state
space $S(B(H))$ of $B(H)$. We compute this measure if $\omega \in S(A)$ is permutation-invariant
in that each restriction $\omega_{1/N} = \omega_{|A_{\Lambda_N}}$ is invariant under the natural
action of the permutation group $\mathfrak{S}_{|\Lambda_N|}$ on $B(H_{\Lambda_N}) \cong \otimes_{x \in \Lambda_N} B(H)$, where $N \in \mathbb{N}$
and $|\Lambda_N|$ is the number of points in $\Lambda_N$ (as in the case of $B^\infty$ in §8.2). It follows
from the Quantum De Finetti Theorem 8.9 (and the fact that the set
$S^{\mathfrak{S}_\infty}(A)$ of permutation-invariant states on $A$ is a so-called Bauer simplex) that each
permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ takes the form


$\omega = \int_{S(B(H))} d\mu(\rho)\, \rho^\infty$, (8.174)


where $\mu$ is some probability measure on $S(B(H))$, and $\rho \in S(B(H))$; the associated
state $\rho^\infty$ on $A$ is defined by its values on each $A_{\Lambda_N} \subset A$ via the isomorphism


$A_{\Lambda_N} \cong \otimes_{x \in \Lambda_N} B(H)$. (8.175)


Furthermore, the integral in (8.174) is defined weakly, i.e., for any $a \in A$ the number
$\omega(a)$ is obtained by integrating the function $\rho \mapsto \rho^\infty(a)$ on $S(B(H))$ with respect to
$\mu$. In particular, $\omega \in \partial_e S^{\mathfrak{S}_\infty}(A)$ iff $\mu$ is a Dirac measure on $S(B(H))$.


Proposition 8.28. Each permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ is macroscopic
(cf. Definition 8.26), and the probability measure $\mu_0$ on $S(B(H))$ defined by $\omega_0^{(c)}$
via (8.168) coincides with the one appearing in (8.174).


Proof. Let $(a_{1/N})$ be a symmetric sequence (the quasi-symmetric case follows from
this), so that $a_{1/N} = S_{M,N}(a_{1/M})$ for some $M$ whenever $N > M$, cf. (8.43). The limit
$a_0 \in C(S(B(H)))$ is given by (8.172), so that the state $\omega_0^{(c)}$ on $C(S(B(H)))$ defined by


$\omega_0^{(c)}(f) = \int_{S(B(H))} d\mu(\rho)\, f(\rho)$ (8.176)


satisfies the required condition


$\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_{1/M}(a_{1/M}) = \int_{S(B(H))} d\mu(\rho)\, \rho^M(a_{1/M}) = \omega_0^{(c)}(a_0)$.

To proceed we make the following technical assumption on $\omega \in S(A)$ (which is
satisfied in typical physical models): if $\pi_\omega(a_{1/N}) \to 0$ weakly in $B(H_\omega)$, for some
sequence $(a_{1/N})$ where $a_{1/N} \in A_{1/N}$, then $\pi_\omega(a_{1/N})\Omega_\omega \to 0$ in $H_\omega$ (in norm).


Theorem 8.29. Assume that the state $\omega$ in part 1 below (and likewise the states $\omega_1$
and $\omega_2$ in part 2) satisfies the above technical condition. Then:


1. If $\omega$ is a primary macroscopic state on $A$, then the corresponding state $\omega_0^{(c)}$ is
pure, i.e., the probability measure $\mu_0$ on $S(B(H))$ is a Dirac measure.
2. If $\omega_1$ and $\omega_2$ are quasi-equivalent primary macroscopic states on $A$, then $\mu_1 = \mu_2$
(and hence if $\mu_1 \neq \mu_2$, then $\omega_1$ and $\omega_2$ are disjoint).

The techniques in the proof below can be used to show that our additional assumption
is equivalent to: if (8.178) below holds weakly in $B(H_\omega)$, then it also holds
strongly. Thus we could have redefined a macroscopic state $\omega$ as one for which the
strong limit $\lim_{N\to\infty} \pi_\omega(a_{1/N})$ exists in $B(H_\omega)$ (and some authors indeed do so).


Proof. We first show that if $\omega$ is a primary macroscopic state on $A$, and $(a_{1/N})$ is
symmetric (from which the quasi-symmetric case duly follows) such that


$\lim_{N\to\infty} \omega(a_{1/N}) = \alpha$, (8.177)


then, in the weak operator topology on $B(H_\omega)$, with $H_\omega$ the GNS-representation space,


$\lim_{N\to\infty} \pi_\omega(a_{1/N}) = \alpha \cdot 1_{H_\omega}$. (8.178)


To this end, we first note that $\|a_{1/N}\|$ is uniformly bounded in $N$: if $(a_{1/N})$ is
symmetric, as in (8.43), then obviously $\|a_{1/N}\| \leq \|a_{1/M}\|$ for all $N > M$, so that if $(a_{1/N})$
is merely quasi-symmetric we have $\|a_{1/N}\| < \|\tilde{a}_{1/M}\| + \varepsilon$ for all $N > M$, where $\varepsilon$
and $M$ are the quantities appearing in Definition 8.3. Hence it is enough to establish
the weak limit (8.178) between states in a dense set, viz. $\pi_\omega(b)\Omega_\omega$, where $b \in A$,
or even in $\bigcup_N A_{1/N}$. Furthermore, using the polarization identity (A.5) and (C.8) -
(C.9), it is enough to prove that for each $K \in \mathbb{N}$ and $b \in A_{1/K}$, we have

$\lim_{N\to\infty} \omega(b^* a_{1/N} b) = \alpha\, \omega(b^* b)$, (8.179)


since by the GNS-construction we obviously have

$\langle \pi_\omega(b)\Omega_\omega, \pi_\omega(a_{1/N})\pi_\omega(b)\Omega_\omega\rangle = \omega(b^* a_{1/N} b)$. (8.180)


Theorem 8.23 implies (or even states) that if $\omega$ is primary, for each $b \in A$ and $\varepsilon > 0$
there is $M \in \mathbb{N}$ such that for all $a \in A_{\Lambda_M^c}$ with $\|a\| = 1$, we have


$|\omega(b^* b a) - \omega(b^* b)\omega(a)| < \varepsilon$. (8.181)


Assuming $b \in A_{1/K}$, we first note that $\lim_{N\to\infty} [a_{1/N}, b] = 0$ in norm (even though
$\lim_{N\to\infty} a_{1/N}$ does not exist in norm), and secondly that, for any given $M \in \mathbb{N}$, if
$a'_{1/N}$ is the same as $a_{1/N}$ except that in any term $b_1 \otimes \cdots \otimes b_{|\Lambda_N|}$ that contributes to
$a_{1/N}$ we replace $b_j \rightsquigarrow 1$ whenever $b_j \in A_{1/M}$, then


$\lim_{N\to\infty} \|a'_{1/N} - a_{1/N}\| = 0$. (8.182)


Given (8.177), these facts with (8.181) immediately give (8.179) and hence (8.178).
According to (8.177) and (8.178), the state $\omega_0^{(c)} \in S(C(S(B(H))))$ is given by


$\omega_0^{(c)}(a_0) = \lim_{N\to\infty} \langle \Omega_\omega, \pi_\omega(a_{1/N})\Omega_\omega\rangle$, (8.183)
where $(a_{1/N})$ is some symmetric sequence converging to $a_0$ in the sense of (C.560);
as in the proof of Proposition 8.27, the left-hand side is independent of the particular
choice of this sequence. The proof of Proposition 8.27 also showed that if $a_{1/N} \to a_0$
and $b_{1/N} \to b_0$, then $a_{1/N} b_{1/N} \to a_0 b_0$, so that


$\omega_0^{(c)}(a_0 b_0) = \lim_{N\to\infty} \langle \Omega_\omega, \pi_\omega(a_{1/N} b_{1/N})\Omega_\omega\rangle$
$= \lim_{N\to\infty} \langle \Omega_\omega, (\pi_\omega(a_{1/N}) - \alpha \cdot 1_{H_\omega})\, \pi_\omega(b_{1/N})\Omega_\omega\rangle + \alpha\beta$,


where $\alpha$ is defined by (8.177), and likewise $\beta$. At this point we need our additional
assumption, which, together with uniform boundedness of $\|\pi_\omega(a_{1/N})\|$ and
hence of $\|\pi_\omega(a_{1/N})\Omega_\omega\|$ in $N$, yields that the first term in the second line is zero.
Therefore, $\omega_0^{(c)}$ is multiplicative and hence pure (cf. Proposition C.14).

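To prove the second claim, first suppose $\omega_1$ and $\omega_2$ are quasi-equivalent. In that
case, up to unitary equivalence, either $\pi_{\omega_1}$ is a subrepresentation of $\pi_{\omega_2}$, or vice
versa; assume the former. We then have a projection $e \in \pi_{\omega_2}(A)'$ such that


$\pi_{\omega_1}(a) = e\,\pi_{\omega_2}(a)$, (8.184)


for each $a \in A$, and since $e = 1_{H_{\omega_1}}$ by construction, eq. (8.178) gives


$\lim_{N\to\infty} \pi_{\omega_1}(a_{1/N}) = \alpha_1 \cdot 1_{H_{\omega_1}}$; (8.185)
$\lim_{N\to\infty} \pi_{\omega_2}(a_{1/N}) = \alpha_2 \cdot 1_{H_{\omega_2}}$. (8.186)


Multiplying both sides of (8.186) with $e$ gives $\alpha_1 = \alpha_2$.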


Corollary 8.30. A permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ is primary iff the
corresponding measure $\mu$ in (8.174) is a Dirac measure, and it is pure iff the latter is
supported by a pure state on $B(H)$.


Proof. In the first claim, the inference from “primary” to “Dirac” obviously follows
from Theorem 8.29. The converse direction is a consequence of the commutation 
theorem (C.329) for von Neumann algebras, combined with the fact that each rep- 
resentation of B(H) for finite-dimensional H is primary (which in turn follows from 
the fact, not proved in this book, that B(H) has just one irreducible representation, 
up to equivalence). The second claim follows from Proposition C.105. 
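
A numerical shadow of Corollary 8.30 is that macroscopic averages become dispersion-free in the limit exactly when $\mu$ is a Dirac measure. The NumPy sketch below (the states, the observable $\sigma_z$, and all helper names are illustrative assumptions of ours) compares the variance of the mean-field average of $\sigma_z$ in a product state $\rho^N$, corresponding to a Dirac measure at $\rho$, with its variance in an equal mixture of the all-up and all-down product states, corresponding to a two-point measure.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
SZ = np.diag([1.0, -1.0])

def kron_all(mats):
    """Tensor product of a list of matrices."""
    return reduce(np.kron, mats)

def mean_field(op, n):
    """(1/n) * sum_j op_j over an n-site system."""
    return sum(kron_all([op if k == j else I2 for k in range(n)])
               for j in range(n)) / n

def variance(rho_n, a):
    """omega(a^2) - omega(a)^2 for the state with density matrix rho_n."""
    mean = np.trace(rho_n @ a).real
    return np.trace(rho_n @ a @ a).real - mean ** 2

rho = np.array([[0.7, 0.1], [0.1, 0.3]])          # hypothetical one-site state
up, down = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])

var_dirac, var_mixture = [], []
for n in (2, 4, 6):
    a_n = mean_field(SZ, n)
    var_dirac.append(variance(kron_all([rho] * n), a_n))
    mixture = 0.5 * (kron_all([up] * n) + kron_all([down] * n))
    var_mixture.append(variance(mixture, a_n))

print(var_dirac)     # decays like 1/N
print(var_mixture)   # stays constant in N
```

In the product state the variance equals $(1 - \rho(\sigma_z)^2)/N$ and tends to zero, whereas in the mixture it remains equal to 1 for every $N$, which is the variance of the limit function $a_0$ under the two-point measure.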


Finally, one macroscopic state generates many others. A folium in the state space
$S(A)$ of a C*-algebra $A$ is a convex, norm-closed subspace $\mathcal{F}$ of $S(A)$ with the
property that if $\omega \in \mathcal{F}$ and $b \in A$ such that $\omega(b^* b) > 0$, then the “reduced” state
$\omega_b : a \mapsto \omega(b^* a b)/\omega(b^* b)$ must be in $\mathcal{F}$. For example, if $\pi$ is a representation of
$A$ on a Hilbert space $H$, then the set of all density matrices on $H$ (i.e. the $\pi$-normal
states on $A$) comprises a folium $\mathcal{F}_\pi$. In particular, each state $\omega$ on $A$ defines a folium
$\mathcal{F}_\omega = \mathcal{F}_{\pi_\omega}$ through its GNS-representation $\pi_\omega$. It then follows from cyclicity of the
GNS-representation that each state in the folium $\mathcal{F}_\omega$ of a macroscopic state $\omega \in S(A)$
is automatically macroscopic and even has the same limit state $\omega_0^{(c)}$ as $\omega$.
Notes


§8.1. Large quantum numbers 

Theorem 8.1 has been adapted from Landsman (1998b); the proof relies on Si- 
mon (1980), who, generalizing the case of SU (2) treated by Lieb (1973), in turn uses 
the coherent states for Lie groups introduced by Perelomov (1972, 1986). Duffield 
(1999) gives the details of the method of steepest descent used in proving (8.30). 
Although this material was inspired by Bohr’s Correspondence Principle, at the end 
of the day the relationship may seem remote. 


§8.2. Large systems 

The theory in this section, which elaborates on Landsman (2007), is a reformula- 
tion in terms of continuous bundles of C*-algebras of the formal parts of a series of 
papers on quantum mean-field systems by Raggio & Werner (1989, 1991), Duffield 
& Werner (1992a,b,c), and Duffield, Roos, & Werner (1992). These models have 
their origin in the treatment of the BCS theory of superconductivity due to Bogoliubov
(1958) and Haag (1962); for further references see the notes to §10.8.


§8.3. Quantum de Finetti Theorem 
Theorem 8.9 is due to Størmer (1969), whose proof was based on the fact that
the $\mathfrak{S}_\infty$-action on $B^\infty$ is asymptotically abelian, in that for any $a, a' \in B^\infty$ one has


$\inf\{\|[\alpha_p(a), a']\|, p \in \mathfrak{S}_\infty\} = 0$.


This implies that $S^{\mathfrak{S}_\infty}(B^\infty)$ is a Choquet simplex, which quickly leads to (8.66). Our
proof is taken from Hudson & Moody (1975). See also Caves, Fuchs, & Schack
(2002a). Finite-size corrections to Theorem 8.9 are studied e.g. in König & Mitchison
(2009). Corollary 8.11 is due to Hewitt & Savage (1955), who credit Jules Haag
(rather than De Finetti) for the binary case (i.e., $X = \{0,1\}$). See Kallenberg (2005)
for an exhaustive account of such results (in classical probability theory).

Proposition 8.12 is taken from Diaconis & Freedman (1980), who also give
finite-size corrections to Corollary 8.11, as follows. Let a permutation-invariant
probability measure $\nu_N$ on $X^N$ be $K$-exchangeable, so that there is a permutation-invariant
probability measure $\nu_{N+K}$ on $X^{N+K}$ whose restriction to $X^N$ is $\nu_N$. Let
$P_{N+K}$ be the probability measure on $\mathrm{Pr}(X)$ defined by $\nu_{N+K}$ as in (8.85), i.e.,
Py+K(A) = Vw+Kx (BycglAy and finally define 


VN+K = [ a dPyik(u)m**, 


as in (8.79). Then, in terms of the usual norm on the Banach dual C(X”)*, 


K(K-1) 


Vn — Vw || < 
Iw — val sO 


Proposition 8.13 is stated without proof in Kingman (1978). See Mackey (1974) or 
Gray (2009) for ergodic theory in connection with probability theory. 
Of course, there are numerous results in probability theory that do not share the
problems of the law of large numbers. For example, in the situation (8.94), for any 
€ > 0 one has the Chernoff—Hoeffding bound 


N | 1 y | > e| < —2Ne2 
Say, xi Ze Se ’ 
Lu Ne P 


which is superior to the weak law of large numbers, i.e., for every € > 0, 


1x 
‘ a fe ie _ 
Jim u (5 P| =0, 


which from the point of view of Earman’s Principle is already a marked conceptual 
improvement over the strong law (but which is mathematically weaker). 
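
The quantitative character of such a bound is easy to probe by simulation. The sketch below rests on assumptions of ours: the situation (8.94) is not reproduced in these notes, so we take independent Bernoulli variables with success probability $p$, and we use the standard two-sided Hoeffding bound $2e^{-2N\varepsilon^2}$ for the probability that the empirical mean deviates from $p$ by at least $\varepsilon$. All parameter values and function names are hypothetical.

```python
import numpy as np

def hoeffding_bound(n, eps):
    """Standard two-sided Hoeffding bound for the mean of n variables in [0, 1]."""
    return 2.0 * np.exp(-2.0 * n * eps ** 2)

def empirical_deviation_probability(n, p, eps, trials, seed=0):
    """Monte Carlo estimate of P(|mean - p| >= eps) for n Bernoulli(p) trials."""
    rng = np.random.default_rng(seed)
    means = rng.binomial(n, p, size=trials) / n
    return float(np.mean(np.abs(means - p) >= eps))

n, p, eps = 200, 0.3, 0.1               # illustrative parameters
bound = hoeffding_bound(n, eps)
estimate = empirical_deviation_probability(n, p, eps, trials=20000)
print(estimate, bound)
```

Unlike the laws of large numbers, the bound holds at every finite $N$; the simulated deviation probability lies below it, typically by a wide margin, since the bound does not use the variance of the variables.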


§8.4. Frequency interpretation of probability and Born rule 

The Kolmogorov quote is from Fine (1973, p. 94), which even 40 years later is
still to be recommended as one of the best (technical) books on the foundations of
probability theory. See also Hájek & Hitchcock (2016) for a comprehensive recent
survey of the philosophy of probability. The Keynes quote is from Hacking (2001,
p. 149), which is a very elementary introduction to the foundations of probability.
At a more advanced level see also Gillies (2000), whilst Howson (1995) is a useful
brief survey.

The original version of the Principal Principle (Lewis, 1980) equated probability
(or chance) as subjective degree of belief (i.e. credence) with objective chance
(though in the single case as opposed to relative frequency). Our own version in the
main text is meant to clarify the relationship between single-case probabilities and
long run frequencies, both seen as objective.

Attempts to derive the Born rule started with Finkelstein (1965) and were contin- 
ued e.g. by Hartle (1968), Farhi, Goldstone, & Gutmann (1989), Van Wesep (2006), 
Aguirre & Tegmark (2011), Moulay (2014), and others, partly based on indubitable 
mathematical arguments in the spirit of the strong law of large numbers supplied 
by e.g. Ochs (1977, 1980), Bugajski & Motyka (1981), Pulmannová & Stehlíková
(1986). Such attempts (typically presented as claims) provoked valid critiques of the
kind mentioned in the main text from e.g. Cassinelli & Sánchez-Gómez (1996) and
Caves & Schack (2005). For a balanced account see also Cassinelli & Lahti (1989). 
Infinite tensor products of Hilbert spaces were introduced by von Neumann (1938). 

Our approach, which is sympathetic to both sides of the dispute, is a vast ex- 
pansion of Landsman (2008). The existence of eco as in (8.109) - (8.110) is based 
on the same extension argument that proves the Kolmogorov existence theorem for 
infinite product probabilities, see e.g. Dudley (1989), proof of Theorem 8.2.2, and 
Van Wesep (2006), who carries out the proof for X = {0, 1}. 

There is also a large (and inconclusive) literature on alleged derivations of the 
Born rule in the context of the Many-Worlds (i.e. Everettian) Interpretation of quan- 
tum mechanics, which may be traced back from Wallace (2012), who supports such 
derivations, and Dawid & Thébault (2015), who criticize them. 
§8.5. Quantum spin systems: Quasi-local C*-algebras

Basic references are Ruelle (1969), Israel (1979), Bratteli & Robinson (1987, 
1997), and Simon (1993); for macroscopic states see Hepp (1972) and Sewell 
(2002). Naaijkens (2013) is a useful brief introduction to quantum spin systems. 

The proof that Haag duality holds for quantum spin systems is far from trivial: see
Simon (1993), Prop. IV.1.6. In the proof of (8.135), simplicity of $A$ given simplicity
of each $A_\Lambda$ is easily inferred from the fact that if $I \subset A$ is an ideal, then $I_\Lambda = I \cap A_\Lambda$ is
an ideal in $A_\Lambda = B(H_\Lambda)$, which must be either zero or $A_\Lambda$, both of which contradict
non-triviality of $I$. Theorem 8.23 is a famous result due to Lanford & Ruelle (1969),
partly anticipated by Powers (1967). For a complete proof see also Simon (1993),
Theorem IV.1.4.


§8.6. Quantum spin systems: Bundles of C*-algebras

This section was inspired by Landsman (2007), §6, and Gerisch (1993).

Folia of states (in the sense meant here) were introduced by Haag, Kadison, & 
Kastler (1970), but note that the name “folium” is poorly chosen, since S(A) is by 
no means foliated by its folia (for example, a folium may contain subfolia). 


Chapter 9 
Symmetry in algebraic quantum theory 


In §3.9 we defined symmetries of classical physics as symmetries of either Poisson 
manifolds or Poisson algebras; these notions are equivalent. At the bare level of the 
underlying phase space X, merely seen as a locally compact space (rather than a 
Poisson manifold), the key result establishing this equivalence is this: 


Theorem 9.1. Let $X$ and $Y$ be locally compact Hausdorff spaces. Each isomorphism
$\alpha : C_0(Y) \to C_0(X)$ is induced by a homeomorphism $\varphi : X \to Y$ via $\alpha = \varphi^*$ (and so
each automorphism of $C_0(X)$ is induced by a homeomorphism of $X$).

More generally, if $A$ and $B$ are commutative C*-algebras, then each isomorphism
$\alpha : A \to B$ is induced by a homeomorphism $\varphi : \Sigma(B) \to \Sigma(A)$ of the corresponding
Gelfand spectra via $\alpha = G_B^{-1} \circ \varphi^* \circ G_A$, where $G_A : A \to C_0(\Sigma(A))$ is the Gelfand
isomorphism, cf. (C.79), and similarly for $B$ (and so each automorphism of $A$ is
induced by a homeomorphism of its Gelfand spectrum $\Sigma(A)$).


This immediately follows from Theorems C.8 and C.45, and Corollary C.48. 
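
For a finite discrete space the content of Theorem 9.1 can be verified by hand. In the sketch below (a toy illustration; the three-point space, the bijection `phi`, and the function names are our own assumptions) functions on $X = Y = \{0,1,2\}$ are vectors in $\mathbb{R}^3$, the pullback $\varphi^* f = f \circ \varphi$ is checked to be multiplicative, and $\varphi$ is read back off from the action of $\varphi^*$ on indicator functions.

```python
import numpy as np

phi = [2, 0, 1]   # hypothetical bijection phi: X -> Y on three points, x -> phi[x]

def pullback(f):
    """alpha = phi^*: send f in C(Y) to f o phi in C(X)."""
    return np.array([f[phi[x]] for x in range(len(phi))])

def recover_phi(alpha, n):
    """Read phi off from alpha, using alpha(delta_y) = delta_{phi^{-1}(y)}."""
    recovered = [None] * n
    for y in range(n):
        image = alpha(np.eye(n)[y])      # image of the indicator function of {y}
        x = int(np.argmax(image))        # the unique point where the image equals 1
        recovered[x] = y
    return recovered

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, -1.0, 4.0])

# Pointwise products are preserved, as befits a homomorphism of function algebras.
multiplicative = np.allclose(pullback(f * g), pullback(f) * pullback(g))
recovered = recover_phi(pullback, 3)
print(multiplicative, recovered)
```

The recovery step is the finite-dimensional counterpart of passing from an isomorphism of algebras to a homeomorphism of Gelfand spectra: points of the space correspond to minimal projections (indicator functions of singletons).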

In Chapter 5 we saw that even in elementary quantum mechanics, where $A = B(H)$
for some Hilbert space $H$, the concept of a symmetry is more diverse, at least
apparently, since a non-commutative C*-algebra like $B(H)$ gives rise to numerous
“quantum structures”. The ones we looked at were listed after Proposition 5.3, viz.


1. The normal pure state space $\mathcal{P}_1(H)$, dressed with a transition probability (2.44).
2. The normal (total) state space $\mathcal{D}(H)$, seen as a convex set; see Theorem 2.8.
3. The self-adjoint operators $B(H)_{\mathrm{sa}}$ on $H$, seen as a Jordan algebra.
4. The effects $\mathcal{E}(H) = [0,1]_{B(H)}$ on $H$, seen as a convex poset.
5. The projections $\mathcal{P}(H)$ on $H$, seen as an orthocomplemented lattice.
6. The unital commutative C*-subalgebras $\mathcal{C}(B(H))$ of $B(H)$, seen as a poset.

Each structure comes with its own notion of a symmetry, see Definition 5.1. This 
raises two questions, which for B(H) were completely answered in Chapter 5: 


• The possible equivalence of the various notions of quantum symmetry;
• Unitary implementability of symmetries.


Indeed, it was found that if dim(H) > 2, then all these notions of symmetry are 
equivalent, as well as unitarily implementable a la Wigner; see Theorem 5.4. 
9.1 Symmetries of C*-algebras and Hamhalter’s Theorem


In this chapter we generalize this analysis from $A = B(H)$ to arbitrary C*-algebras
$A$, which for simplicity we assume to have a unit $1_A$. See §C.25 for terminology.


Definition 9.2. Let A be a unital C*-algebra. 


1. The pure state space $P(A) = \partial_e S(A)$ is the extreme boundary of the state space
$S(A)$, seen as a uniform space equipped with a transition probability


$\tau(\omega, \omega') = \inf\{\omega(a) \mid a \in A, 0 \leq a \leq 1_A, \omega'(a) = 1\}$. (9.1)


A Wigner symmetry of $A$ is a uniformly continuous bijection $W : P(A) \to P(A)$
with uniformly continuous inverse that preserves transition probabilities, i.e.,


$\tau(W(\omega), W(\omega')) = \tau(\omega, \omega')$, $\omega, \omega' \in P(A)$. (9.2)


If $A = B(H)$, Proposition C.177 guarantees that the above expression reproduces
the standard quantum-mechanical transition probabilities (2.44), but compared
to this special case, one novel aspect of $P(A)$ is that all pure states are now taken
into account (as opposed to merely the normal ones, which notion is undefined
for general C*-algebras anyway). Another is that in order to obtain the desired
equivalence with other structures, the set $P(A)$ should carry a uniform structure,
namely the $w^*$-uniformity inherited from $A^*$.

2. The state space $S(A)$ is the set of all states on $A$, seen as a compact convex set in
the $w^*$-topology inherited from the embedding $S(A) \subset A^*$. A Kadison symmetry
of $A$ is an affine homeomorphism $K : S(A) \to S(A)$.

Compared to $A = B(H)$, firstly all states are now taken into account (instead of
all normal states), and secondly we have added a continuity condition on $K$.

3. Any C*-algebra $A$ defines an associated Jordan algebra (more precisely, a JB-algebra),
namely $A_{\mathrm{sa}}$ equipped with the commutative product $a \circ b = \frac{1}{2}(ab + ba)$.
A Jordan symmetry $J$ of $A$ is a Jordan isomorphism of $(A_{\mathrm{sa}}, \circ)$ (or, equivalently,
an invertible unital linear isometry of $(A_{\mathrm{sa}}, \|\cdot\|)$, which in turn is the same as
a unital linear order isomorphism of $(A_{\mathrm{sa}}, \leq)$, cf. Lemma C.173). A weak Jordan
symmetry of $A$ is an invertible map $J : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ whose restriction to each
subspace $C_{\mathrm{sa}}$ of $A_{\mathrm{sa}}$, where $C \in \mathcal{C}(A)$, is linear and preserves the Jordan product.

4. The effects in $A$ comprise the order unit interval $\mathcal{E}(A) = [0, 1_A]$, i.e., the set of
all $a \in A_{\mathrm{sa}}$ such that $0 \leq a \leq 1_A$, seen as a convex poset in the obvious way. A
Ludwig symmetry of $A$ is an affine order isomorphism $L : \mathcal{E}(A) \to \mathcal{E}(A)$.

5. The projections $\mathcal{P}(A)$ in $A$ form an orthomodular poset (cf. Definition D.1) with
$e \leq f$ iff $ef = e$ and $e^\perp = 1_A - e$; if $A$ is a von Neumann algebra (cf. Proposition
C.136), or more generally an AW*-algebra or a Rickart C*-algebra (see §C.24),
$\mathcal{P}(A)$ is even an orthomodular lattice. A von Neumann symmetry of $A$ is an
isomorphism $N : \mathcal{P}(A) \to \mathcal{P}(A)$ of orthomodular posets.

6. The poset $\mathcal{C}(A)$ (lying at the heart of exact Bohrification) consists of all commutative
C*-subalgebras of $A$ that contain the unit $1_A$, partially ordered by inclusion.
A Bohr symmetry of $A$, then, is an order isomorphism $B : \mathcal{C}(A) \to \mathcal{C}(A)$.
The structures 1, 2, 3 (with Jordan symmetries), and 4 are equivalent; see Theorem
C.179 for $1 \leftrightarrow 2$ and Theorem C.172 for $2 \leftrightarrow 3$; the equivalence $3 \leftrightarrow 4$ is proved
in exactly the same way as in Proposition 5.21, with Lemma 5.20 for the special case
$A = B(H)$ replaced by Lemma C.173 (which has the same proof). From 1-4 we pick
the Jordan algebra structure of $A$, since it gives the most straightforward results.

Henceforth, $A$ and $B$ are unital C*-algebras, and we define a weak Jordan isomorphism
of $A$ and $B$ as an invertible map $J : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ whose restriction to each
subspace $C_{\mathrm{sa}}$ of $A_{\mathrm{sa}}$, where $C \in \mathcal{C}(A)$, is linear and preserves the Jordan product
$\circ$ (so that a Jordan symmetry of $A$ alone is a weak Jordan automorphism of $A$).
Such a map complexifies to a map $J_{\mathbb{C}} : A \to B$ in the usual way, i.e., writing $a \in A$
as $a = b + ic$, with $b^* = b$ and $c^* = c$, cf. (C.9), we put $J_{\mathbb{C}}(a) = J(b) + iJ(c)$. If no
confusion arises, we just write $J$ for $J_{\mathbb{C}}$. We first turn to Bohr symmetries.


Proposition 9.3. Given a weak Jordan isomorphism $J : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$, the ensuing map
$B : \mathcal{C}(A) \to \mathcal{C}(B)$ defined by $B(C) = J_{\mathbb{C}}(C) \equiv J(C)$ is an order isomorphism.


Note that as an argument of $B$ the symbol $C$ is a point in the poset $\mathcal{C}(A)$, whereas
as an argument of $J_{\mathbb{C}}$ it is a subset of $A$, so that $J_{\mathbb{C}}(C)$ stands for $\{J_{\mathbb{C}}(c) \mid c \in C\}$.


Proof. The restriction $J_{|C} : C \to B$ is a homomorphism of C*-algebras on each
commutative C*-algebra $C \subset A$ (although $J : A \to B$ may not be). Since $J_{|C}$ is injective
on $C_{\mathrm{sa}}$ (where it coincides with $J$), it is also injective on $C$. Hence $J_{|C}$ is isometric
by Theorem C.62.3, so that its range is closed and therefore $J(C)$ is a commutative
C*-algebra in $B$, which is unital if $C$ is. Trivially, if $C \subseteq D$ in $A$ (so that $C \leq D$ in
$\mathcal{C}(A)$), then $J(C) \subseteq J(D)$ in $B$ (so that $J(C) \leq J(D)$ in $\mathcal{C}(B)$).


The converse, however, is a deep result, which we call Hamhalter’s Theorem: 


Theorem 9.4. Let $A$ and $B$ be unital C*-algebras and let $B : \mathcal{C}(A) \to \mathcal{C}(B)$ be an
order isomorphism. Then there is a weak Jordan isomorphism $J : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ such that
$B = J_{\mathbb{C}}$. Moreover, if $A$ is isomorphic to neither $\mathbb{C}^2$ nor $M_2(\mathbb{C})$, then $J$ is uniquely
determined by $B$, so in that case there is a bijective correspondence $J \leftrightarrow B$ between
weak Jordan symmetries $J$ of $A$ and Bohr symmetries $B$ of $A$.


Before proving this, let us explain why $\mathbb{C}^2$ and $M_2(\mathbb{C})$ are exceptional. In the first
case, $\mathcal{C}(\mathbb{C}^2) = \{0, 1\}$ (with $0 \equiv \mathbb{C}\cdot 1_2$ and $1 \equiv \mathbb{C}^2$), which admits just one order
isomorphism (viz. the identity map), which is induced by both the map $(a,b) \mapsto (b,a)$
and by the identity map on $\mathbb{C}^2$ (each of which is a weak Jordan automorphism).

In the second case, the poset $\mathcal{C}(M_2(\mathbb{C}))$ has a bottom element $0 = \mathbb{C}\cdot 1_2$, as
before, but no top element; each element $C \neq \mathbb{C}\cdot 1_2$ of $\mathcal{C}(M_2(\mathbb{C}))$ is a unitary
conjugate of the diagonal subalgebra $D_2(\mathbb{C})$, with $0 \leq C$ but no other orderings. Furthermore,
$C \cap C' = \mathbb{C}\cdot 1_2$ whenever $C \neq C'$. Hence any order isomorphism of $\mathcal{C}(M_2(\mathbb{C}))$
maps $\mathbb{C}\cdot 1_2$ to itself and permutes the $C$’s. Thus each map $J : M_2(\mathbb{C})_{\mathrm{sa}} \to M_2(\mathbb{C})_{\mathrm{sa}}$
whose complexification $J_{\mathbb{C}} : M_2(\mathbb{C}) \to M_2(\mathbb{C})$ shuffles the $C$’s isomorphically (as
C*-algebras) gives a weak Jordan automorphism. For example, take $(a,b) \mapsto (b,a)$
on $D_2(\mathbb{C})$ and the identity on each $C \neq D_2(\mathbb{C})$; this induces the identity map on
$\mathcal{C}(M_2(\mathbb{C}))$. It follows that there are vastly more weak Jordan automorphisms of
$M_2(\mathbb{C})$ than there are order isomorphisms of $\mathcal{C}(M_2(\mathbb{C}))$.
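
The example just given can be checked directly. The NumPy sketch below (function names and test matrices are our own illustrative choices) implements the map that flips the two diagonal entries on $D_2(\mathbb{C})$ and acts as the identity on every non-diagonal matrix. It confirms that this map is linear and Jordan-multiplicative on the self-adjoint part of $D_2(\mathbb{C})$ and is its own inverse, yet fails to be linear on all of $M_2(\mathbb{C})_{\mathrm{sa}}$, which is what distinguishes a weak Jordan automorphism from a Jordan symmetry.

```python
import numpy as np

def is_diagonal(a, tol=1e-12):
    """True iff the 2x2 matrix a lies in the diagonal subalgebra D_2(C)."""
    return abs(a[0, 1]) < tol and abs(a[1, 0]) < tol

def J(a):
    """Flip the diagonal entries on D_2(C); act as the identity elsewhere."""
    if is_diagonal(a):
        return np.diag([a[1, 1], a[0, 0]])
    return a.copy()

def jordan(a, b):
    """Jordan product (ab + ba) / 2."""
    return 0.5 * (a @ b + b @ a)

d1 = np.diag([1.0, 0.0])                    # self-adjoint elements of D_2(C)
d2 = np.diag([2.0, 5.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])     # self-adjoint, not diagonal

linear_on_D2 = np.allclose(J(d1 + d2), J(d1) + J(d2))
jordan_on_D2 = np.allclose(J(jordan(d1, d2)), jordan(J(d1), J(d2)))
involutive = np.allclose(J(J(d1)), d1) and np.allclose(J(J(sx)), sx)
globally_linear = np.allclose(J(d1 + sx), J(d1) + J(sx))
print(linear_on_D2, jordan_on_D2, involutive, globally_linear)
```

On any other maximal commutative subalgebra $C \neq D_2(\mathbb{C})$ the map is the identity, because $C$ meets $D_2(\mathbb{C})$ only in the scalars, which the flip leaves fixed; the failure of global linearity shows up as soon as a diagonal and a non-diagonal matrix are added.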
Proof. The key to the proof lies in the commutative case, which can be reduced to topology. If A = C(X), any C ∈ 𝒞(A) induces an equivalence relation ∼_C on X by

\[ x \sim_C y \iff f(x) = f(y) \quad \forall f \in C. \tag{9.3} \]

This, in turn, defines a partition X = ⊔_λ K_λ of X (henceforth called π), whose blocks K_λ ⊂ X are the equivalence classes of ∼_C. To study a possible inverse of this procedure, for any closed subset K ⊂ X we define the ideal

\[ I_K = C(X;K) = \{ f \in C(X) \mid f(x) = 0 \ \forall x \in K \}, \tag{9.4} \]

in C(X), and its unitization İ_K = I_K ⊕ ℂ·1_X, which evidently consists of all continuous functions on X that are constant on K. If X is finite (and discrete), each partition π of X defines some unital C*-algebra C ⊆ C(X) through

\[ C = \bigcap_{K_\lambda \in \pi} \dot{I}_{K_\lambda}, \tag{9.5} \]

which consists of all f ∈ C(X) that are constant on each block K_λ of the given partition π. In that case, the correspondence C ↔ π, where π is defined by the equivalence relation ∼_C in (9.3), gives a bijection between 𝒞(C(X)) and the set 𝔓(X) of all partitions of X. For example, the subalgebra C = İ_K corresponds to the partition consisting of K and all singletons not lying in K. Given the already defined partial order on 𝒞(C(X)) (i.e., C ≤ D iff C ⊆ D), we may promote this bijection to an order isomorphism of posets if we define the partial order ≤′ on 𝔓(X) to be the opposite of the natural one ≤ in which π ≤ π′ (where π and π′ consist of blocks {K_λ} and {K′_λ′}, respectively) iff each K_λ is contained in some K′_λ′ (i.e., π is finer than π′). The partial ordering ≤′ makes 𝔓(X) a complete lattice, whose top element consists of all singletons on X and whose bottom element just consists of X itself: the former corresponds to C(X), which is the top element of 𝒞(C(X)), whilst the latter corresponds to ℂ·1_X, which is the bottom element of 𝒞(C(X)).
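
For finite X the passage from a subalgebra to its partition via (9.3) is easy to compute. The Python sketch below is a minimal illustration under assumptions made here: functions on X are stored as dictionaries, each subalgebra is specified by a list of generators (the relation (9.3) for the generated algebra is already decided by its generators), and the names partition_from_functions and is_finer are hypothetical helpers. It shows that enlarging the subalgebra refines the partition, which is why the order on partitions has to be reversed.

```python
def partition_from_functions(points, functions):
    """Blocks of the relation (9.3): x ~ y iff f(x) == f(y) for every given f."""
    blocks = {}
    for x in points:
        signature = tuple(f[x] for f in functions)
        blocks.setdefault(signature, set()).add(x)
    return {frozenset(block) for block in blocks.values()}

def is_finer(p, q):
    """True if every block of p is contained in some block of q."""
    return all(any(b <= c for c in q) for b in p)

X = [0, 1, 2, 3]
# Generators of two unital subalgebras C <= D of C(X), given as value tables.
gens_C = [{0: 1, 1: 1, 2: 5, 3: 5}]              # constant on {0,1} and on {2,3}
gens_D = gens_C + [{0: 0, 1: 0, 2: 0, 3: 7}]     # additionally separates 2 from 3

pi_C = partition_from_functions(X, gens_C)
pi_D = partition_from_functions(X, gens_D)
```

Here pi_D is finer than pi_C although D contains C, in line with the order reversal described above.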

For general compact Hausdorff spaces X, since C(X) is sensitive to the topology of X, the equivalence relation (9.3) does not induce arbitrary partitions of X. It turns out that each C ∈ 𝒞(C(X)) induces an upper semicontinuous partition (abbreviated by u.s.c. decomposition) of X, i.e.,


• Each block K_λ of the partition π is closed;

• For each block K_λ of π, if K_λ ⊆ U for some open U ∈ 𝒪(X), then there is V ∈ 𝒪(X) such that K_λ ⊆ V ⊆ U and V is a union of blocks of π (in other words, if K is such a block, then V ∩ K ≠ ∅ implies K ⊆ V).


This can be seen as follows. Firstly, if we equip π with the quotient topology with respect to the natural map q : X → π, x ↦ K_λ if x ∈ K_λ, then π is compact, for X is compact. Moreover, π is Hausdorff. To see this, let K_λ and K_μ be two distinct points in π. Recall that x and y lie in the same block if and only if f(x) = f(y) for each f ∈ C. Since K_λ ≠ K_μ, there is some x ∈ K_λ, some y ∈ K_μ and some f ∈ C such that f(x) ≠ f(y), whence there are open disjoint U,V ⊂ ℂ such that f(x) ∈ U and f(y) ∈ V. Define f̂ : π → ℂ by f̂(K_λ) = f(x) for some x ∈ K_λ. By definition of K_λ, this is independent of the choice of x ∈ K_λ, hence f̂ is well defined. Again by definition, we have f = f̂ ∘ q, hence q⁻¹(f̂⁻¹[U]) = f⁻¹[U], which is open in X since f is continuous. Since π is equipped with the quotient topology, it follows that f̂⁻¹[U] is open in π, and similarly f̂⁻¹[V] is open. Moreover, we have f̂(K_λ) = f(x) and f(x) ∈ U, hence K_λ ∈ f̂⁻¹[U], and similarly, K_μ ∈ f̂⁻¹[V]. These two open sets are disjoint because U and V are, and we conclude that π is also Hausdorff. Since q is a continuous map between compact Hausdorff spaces, it follows that q is closed. It is a standard result in topology that q is closed iff π is a u.s.c. decomposition, so we have now proved the latter.

Consequently, by the same maps (9.3) and (9.5), the poset 𝒞(C(X)) is anti-isomorphic to the poset 𝔉(X) of all u.s.c. decompositions of X in the natural ordering ≤ (which proves that 𝔉(X) is a complete lattice, since 𝒞(C(X)) is). This is still a complicated poset; assuming X to be larger than a singleton, the next step is to identify the simpler poset ℱ₂(X) of all closed subsets of X containing at least two elements within 𝔉(X), where (as above) we identify a closed K ⊆ X with the (u.s.c.) partition π_K of X whose blocks are K and all singletons not lying in K (note that the poset ℱ(X) of all closed subsets of X is less useful, since any singleton in ℱ(X) gives rise to the bottom element of 𝔉(X)). To do so, we first recall that β is said to cover α in some poset if α < β, and α ≤ γ < β implies α = γ. If the poset has a bottom element, then its covers are precisely its atoms. Furthermore, note that since the bottom element 0 of 𝔉(X) consists of singletons, the atoms in 𝔉(X) are the partitions of the form π_{{x₁,x₂}} (where x₁ ≠ x₂). It follows that some partition π ∈ 𝔉(X) lies in ℱ₂(X) ⊂ 𝔉(X) iff exactly one of the following conditions holds:


• π is an atom in 𝔉(X), i.e., π = π_{{x₁,x₂}} for some x₁,x₂ ∈ X, x₁ ≠ x₂;

• π covers three (distinct) atoms in 𝔉(X), in which case π = π_{{x₁,x₂,x₃}} where all x_i are different, which covers the atoms π_{{x₁,x₂}}, π_{{x₁,x₃}}, and π_{{x₂,x₃}};

• If α ≠ β are atoms in 𝔉(X) such that α ≤ π and β ≤ π, there is an atom γ ≤ π such that there are three (distinct) atoms covered by α ∨ γ and three (distinct) atoms covered by β ∨ γ. In that case, π = π_K where K has more than three elements: if α = π_{{x₁,x₂}} and β = π_{{x₃,x₄}}, then due to the assumption α ≠ β, the set {x₁,x₂,x₃,x₄} (which lies in K) has at least three distinct elements, say {x₁,x₂,x₃}. Hence we may take γ = π_{{x₂,x₃}}, in which case α ∨ γ = π_{{x₁,x₂,x₃}}, which covers the atoms α, γ, and π_{{x₁,x₃}}. Likewise, we have β ∨ γ = π_{{x₂,x₃,x₄}}, which covers three atoms β, γ, and π_{{x₂,x₄}}.


In order to see that π satisfying the third condition must be of the form π_K, assume the converse. So π contains two blocks K_λ and K_μ consisting of two or more elements. Say {x₁,x₂} ⊆ K_λ and {x₃,x₄} ⊆ K_μ. Then α = π_{{x₁,x₂}} and β = π_{{x₃,x₄}} are atoms such that α, β ≤ π, and there is an atom γ = π_{{x₅,x₆}} ≤ π such that there are three atoms covered by α ∨ γ, and there are three atoms covered by β ∨ γ. It follows from the second condition that α ∨ γ = π_L, with L a three-point set. This implies that {x₁,x₂} ∩ {x₅,x₆} is not empty, from which it follows that α ∨ γ = π_{{x₁,x₂,x₅,x₆}}. Similarly, we find β ∨ γ = π_{{x₃,x₄,x₅,x₆}}. Since {x₁,x₂,x₅,x₆} and {x₃,x₄,x₅,x₆} overlap, we obtain α ∨ β ∨ γ = π_{{x₁,x₂,x₃,x₄,x₅,x₆}}. Moreover, α,β,γ ≤ π, so α ∨ β ∨ γ ≤ π. However, since x₁,x₂ ∈ K_λ, we must have {x₁,x₂,x₃,x₄,x₅,x₆} ⊆ K_λ by definition of the order on 𝔉(X). But since x₃,x₄ ∈ K_μ, we must also have {x₁,x₂,x₃,x₄,x₅,x₆} ⊆ K_μ, which is not possible, since K_λ and K_μ are distinct blocks, hence disjoint. We conclude that π can have only one block K of two or more elements, hence π = π_K. Thus ℱ₂(X) ⊂ 𝔉(X) has been characterized order-theoretically. Moreover,


\[ \pi = \bigvee_{x \in X} \pi_{K(x)}, \tag{9.6} \]


where K(x) is the unique block of π that contains x. Hence ℱ₂(X) determines 𝔉(X).
Let X and Y be compact Hausdorff spaces of cardinality at least two (so that the empty set and singletons are excluded). By the previous analysis, an order isomorphism 𝖡 : 𝒞(C(X)) → 𝒞(C(Y)) is equivalent to an order isomorphism 𝔉(X) → 𝔉(Y), which in turn restricts to an order isomorphism ℱ₂(X) → ℱ₂(Y).


Lemma 9.5. If X and Y are compact Hausdorff spaces of cardinality at least two, then any order isomorphism 𝖥 : ℱ₂(X) → ℱ₂(Y) is induced by a homeomorphism φ : X → Y via 𝖥(F) = φ(F), i.e., 𝖥(F) = ∪_{x∈F} {φ(x)}. Moreover, if X and Y have cardinality at least three, then φ is uniquely determined by 𝖥.


To see the idea, we first prove this for finite X, where ℱ₂(X) simply consists of all subsets of X having at least two elements, etc. It is easy to see that X and Y must have the same cardinality |X| = |Y| = n. If n = 2, then ℱ₂(X) = {X} etc., so there is only one map 𝖥, which is induced by each of the two possible maps φ : X → Y, so that φ exists but fails to be unique. If n > 2, then 𝖥 must map each subset of X with n − 1 elements to some subset of Y with n − 1 elements, so that taking complements we obtain a unique bijection φ : X → Y. To show that φ induces 𝖥, note that the meet ∧ in ℱ₂(X) is simply intersection ∩, and also that for any F ∈ ℱ₂(X),


\[ F = \bigcup_{x \in F} \{x\} = \bigcap_{x \notin F} \{x\}^c = \Big( \bigcup_{x \notin F} \{x\} \Big)^c, \tag{9.7} \]


where A^c = X∖A. Since 𝖥 is an order isomorphism, it preserves ∧ = ∩, so that


\[ \mathsf{F}(F) = \bigcap_{x \notin F} \mathsf{F}(\{x\}^c) = \bigcap_{x \notin F} Y \setminus \{\varphi(x)\} = \Big( \bigcup_{x \notin F} \{\varphi(x)\} \Big)^c = \bigcup_{x \in F} \{\varphi(x)\}. \tag{9.8} \]
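
The finite case can be replayed on a small example. The Python sketch below is a hedged illustration, not part of the proof: the sets X and Y, the bijection called hidden (used only to manufacture an order isomorphism F), and the helper names closed_sets_2 and recover_phi are all assumptions introduced here. The point is that φ is recovered from the co-atoms X∖{x} alone and then induces F on every subset with at least two elements.

```python
from itertools import combinations

def closed_sets_2(points):
    """All subsets with at least two elements, i.e. F_2(X) for finite discrete X."""
    pts = sorted(points)
    return [frozenset(c) for r in range(2, len(pts) + 1) for c in combinations(pts, r)]

def recover_phi(X, Y, F):
    """Recover phi from the images of the co-atoms X \\ {x}, by taking complements."""
    phi = {}
    for x in X:
        image = F[frozenset(X) - {x}]        # a subset of Y missing exactly one point
        (y,) = frozenset(Y) - image          # that missing point is phi(x)
        phi[x] = y
    return phi

X = {0, 1, 2, 3}
Y = {"a", "b", "c", "d"}
hidden = {0: "c", 1: "a", 2: "d", 3: "b"}    # hypothetical bijection used to build F
F = {S: frozenset(hidden[x] for x in S) for S in closed_sets_2(X)}

phi = recover_phi(X, Y, F)
induced = all(F[S] == frozenset(phi[x] for x in S) for S in F)
```

With n = 4 every co-atom has three elements and hence lies in ℱ₂(X), as the argument for n > 2 requires.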


Now assume that X is infinite. Let x ∈ X. If x is not isolated, we define φ(x) as follows. Let 𝒪(x) denote the set of all open neighborhoods of x. Since x is not isolated, each O ∈ 𝒪(x) contains at least another element, so Ō ∈ ℱ₂(X). Moreover, finite intersections of elements of {Ō : O ∈ 𝒪(x)} are still in ℱ₂(X). Indeed, if O₁,...,O_n ∈ 𝒪(x), then O₁ ∩ ... ∩ O_n is an open set containing x, and since O₁ ∩ ... ∩ O_n ⊆ Ō₁ ∩ ... ∩ Ō_n, it follows that Ō₁ ∩ ... ∩ Ō_n ∈ ℱ₂(X). Since 𝖥 is an order isomorphism, we find that finite intersections of {𝖥(Ō) : O ∈ 𝒪(x)} are contained in ℱ₂(Y). This implies that {𝖥(Ō) : O ∈ 𝒪(x)} satisfies the finite intersection property. As Y is compact, it follows that I_x = ∩_{O∈𝒪(x)} 𝖥(Ō) is non-empty. We can say more: it turns out that I_x contains exactly one element. Indeed, assume that there are two different points y₁,y₂ ∈ I_x. Then {y₁,y₂} ∈ ℱ₂(Y), so 𝖥⁻¹({y₁,y₂}) ∈ ℱ₂(X). Since {y₁,y₂} ⊆ 𝖥(Ō) for each O ∈ 𝒪(x), we also find that 𝖥⁻¹({y₁,y₂}) ⊆ Ō for each O ∈ 𝒪(x). This implies that


\[ \mathsf{F}^{-1}(\{y_1, y_2\}) \subseteq \bigcap_{O \in \mathcal{O}(x)} \overline{O} = \{x\}, \tag{9.9} \]


where the last equality holds by normality of X. But this is a contradiction with 𝖥 : ℱ₂(X) → ℱ₂(Y) being a bijection. So I_x contains exactly one point. We define φ(x) such that {φ(x)} = I_x. Notice that φ(x) cannot be isolated in Y, since if we assume otherwise, then Y∖{φ(x)} must be a co-atom in ℱ₂(Y), whence 𝖥⁻¹(Y∖{φ(x)}) is a co-atom in ℱ₂(X), which must be of the form X∖{z} for some isolated z ∈ X. Since x is not isolated, we cannot have x = z, so X∖{z} is an open neighborhood of x, which is even clopen since z is isolated. By definition of φ(x), we must have φ(x) ∈ 𝖥(X∖{z}), but 𝖥(X∖{z}) = Y∖{φ(x)}. We found a contradiction, hence φ(x) cannot be isolated. Now assume that x is an isolated point. Then X∖{x} is a co-atom in ℱ₂(X), so 𝖥(X∖{x}) is a co-atom in ℱ₂(Y), too. Clearly this implies that 𝖥(X∖{x}) = Y∖{y} for some unique y ∈ Y, which must be isolated, since Y∖{y} is closed. We define φ(x) = y.

In an analogous way, 𝖥⁻¹ induces a map ψ : Y → X. We shall show that φ and ψ are each other's inverses. Let x ∈ X be isolated. We have seen that φ(x) must be isolated as well, and that φ(x) is defined by the equation 𝖥(X∖{x}) = Y∖{φ(x)}. Since 𝖥 is an order isomorphism, we have X∖{x} = 𝖥⁻¹(Y∖{φ(x)}). Since φ(x) is isolated, we find by definition of ψ that ψ(φ(x)) = x. In a similar way we find that φ(ψ(y)) = y for each isolated y ∈ Y. Now assume that x is not isolated and let F ∈ ℱ₂(X) such that x ∈ F. Then


\[ \{\varphi(x)\} = \bigcap_{O \in \mathcal{O}(x)} \mathsf{F}(\overline{O}) \subseteq \bigcap \{ \mathsf{F}(\overline{O}) : O \text{ open},\ F \subseteq O \} = \mathsf{F}\Big( \bigcap \{ \overline{O} : O \text{ open},\ F \subseteq O \} \Big) = \mathsf{F}(F), \tag{9.10} \]


where the last equality follows by complete regularity of X. The penultimate equality follows from the following facts. Firstly, the set ∩{Ō : O open, F ⊆ O} is closed since it is the intersection of closed sets. Moreover, the intersection contains more than one point, since F contains two or more points and F ⊆ Ō for each O. Hence ∩{Ō : O open, F ⊆ O} ∈ ℱ₂(X), and since 𝖥 is an order isomorphism, it preserves infima, which justifies the penultimate equality. Hence φ(x) ∈ 𝖥(F) for each F ∈ ℱ₂(X) containing x. Since x is not isolated, φ(x) is not isolated either. Hence in a similar way, we find that ψ(φ(x)) ∈ 𝖥⁻¹(G) for each G ∈ ℱ₂(Y) containing φ(x). Let z = ψ(φ(x)). Combining both statements, we find that z ∈ F for each F ∈ ℱ₂(X) such that x ∈ F. In other words, z ∈ ∩{F ∈ ℱ₂(X) : x ∈ F}. Since x is not isolated, each O ∈ 𝒪(x) contains at least two points. Hence


\[ \bigcap \{ F \in \mathcal{F}_2(X) : x \in F \} \subseteq \bigcap \{ \overline{O} : O \in \mathcal{O}(x) \} = \{x\}, \tag{9.11} \]


where we used complete regularity of X in the last equality. We conclude that z = x, so ψ(φ(x)) = x. In a similar way, we find that φ(ψ(y)) = y for each non-isolated y ∈ Y. We conclude that φ is a bijection with inverse φ⁻¹ = ψ.
Continuing the proof of Lemma 9.5, we have to show that if F ∈ ℱ₂(X), then φ[F] = 𝖥(F). Let x ∈ F. In the proof that φ is a bijection we already noticed that φ(x) ∈ 𝖥(F) if x is not isolated. If x is isolated in X, then we first assume that F has at least three points. Since {x} is open, G = F∖{x} is closed. Since F contains at least three points, G ∈ ℱ₂(X). So G is covered by F in ℱ₂(X), so 𝖥(F) covers 𝖥(G). It follows that there must be an element y_G ∈ Y∖𝖥(G) such that


\[ \mathsf{F}(F) = \mathsf{F}(G \cup \{x\}) = \mathsf{F}(G) \cup \{y_G\}. \tag{9.12} \]
Both G ∪ {x} and X∖{x} are elements of ℱ₂(X), so


\[ \mathsf{F}(G) = \mathsf{F}\big( (G \cup \{x\}) \cap (X \setminus \{x\}) \big) = \mathsf{F}(G \cup \{x\}) \cap \mathsf{F}(X \setminus \{x\}) = \big( \mathsf{F}(G) \cup \{y_G\} \big) \cap \big( Y \setminus \{\varphi(x)\} \big), \tag{9.13} \]


where 𝖥(X∖{x}) = Y∖{φ(x)} by definition of values of φ at isolated points. Since x ∉ G and 𝖥 preserves inclusions, this latter equation also implies 𝖥(G) ⊆ Y∖{φ(x)}. Hence we find


\[ \mathsf{F}(G) = \big( \mathsf{F}(G) \cup \{y_G\} \big) \cap \big( Y \setminus \{\varphi(x)\} \big) = \mathsf{F}(G) \cup \big( \{y_G\} \cap (Y \setminus \{\varphi(x)\}) \big). \tag{9.14} \]


Thus we obtain {y_G} ∩ (Y∖{φ(x)}) ⊆ 𝖥(G), but since y_G ∉ 𝖥(G), we must have φ(x) = y_G. As a consequence, we obtain 𝖥(F) = 𝖥(G) ∪ {φ(x)}, so φ(x) ∈ 𝖥(F).

Summarizing, if F has at least three points, then φ(x) ∈ 𝖥(F) for x ∈ F, regardless of whether x is isolated or not. So φ[F] ⊆ 𝖥(F) for each F ∈ ℱ₂(X) such that F has at least three points. Let F ∈ ℱ₂(X) have exactly two points. Then there are F₁, F₂ ∈ ℱ₂(X) with exactly three points such that F = F₁ ∩ F₂. Then since φ as a bijection and 𝖥 as an order isomorphism both preserve intersections in ℱ₂(X), we find


\[ \varphi[F] = \varphi[F_1 \cap F_2] = \varphi[F_1] \cap \varphi[F_2] \subseteq \mathsf{F}(F_1) \cap \mathsf{F}(F_2) = \mathsf{F}(F_1 \cap F_2) = \mathsf{F}(F). \tag{9.15} \]


So φ[F] ⊆ 𝖥(F) for each F ∈ ℱ₂(X). In a similar way, we find φ⁻¹[G] ⊆ 𝖥⁻¹(G) for each G ∈ ℱ₂(Y). So if we substitute G = 𝖥(F), we obtain φ⁻¹[𝖥(F)] ⊆ F. Since φ is a bijection, it follows that 𝖥(F) = φ[F] for each F ∈ ℱ₂(X). As a consequence, φ induces a one-one correspondence between closed subsets of X and closed subsets of Y. Hence φ is a homeomorphism. This proves Lemma 9.5.

The special case of Theorem 9.4 where A and B are commutative now follows if we combine all steps so far:


1. The Gelfand isomorphism allows us to assume A = C(X) and B = C(Y), as above.

2. The order isomorphism 𝖡 : 𝒞(A) → 𝒞(B) determines an order isomorphism 𝖥 : 𝔉(X) → 𝔉(Y) of the underlying lattices of u.s.c. decompositions, and vice versa.

3. Because of (9.6), the order isomorphism 𝖥 in turn determines and is determined by an order isomorphism 𝖥 : ℱ₂(X) → ℱ₂(Y).

4. Lemma 9.5 yields a homeomorphism φ : X → Y inducing 𝖥 : ℱ₂(X) → ℱ₂(Y).

5. The inverse pullback (φ⁻¹)* : C(X) → C(Y) is an isomorphism of C*-algebras, which (running backwards) reproduces the initial map 𝖡 : 𝒞(C(X)) → 𝒞(C(Y)).

Therefore, in the commutative case we apparently obtain rather more than a weak Jordan isomorphism J : A_sa → B_sa; we even found an isomorphism J : A → B of C*-algebras. However, if A and B are commutative, the condition of linearity on each commutative C*-subalgebra C of A includes C = A, so that (after complexification) weak Jordan isomorphisms are the same as isomorphisms of C*-algebras.

We now turn to the general case, in which A and B are both noncommutative (the case where one, say A, is commutative but the other is not, cannot occur, since 𝒞(A) would be a complete lattice but 𝒞(B) would not). Let D and E be maximal abelian C*-subalgebras of A, so that the corresponding elements of 𝒞(A) are maximal in the order-theoretic sense. Given an order isomorphism 𝖡 : 𝒞(A) → 𝒞(B), we restrict the map 𝖡 to the down-set ↓D = 𝒞(D) in 𝒞(A) so as to obtain an order homomorphism 𝖡|_D : 𝒞(D) → 𝒞(B). The image of 𝒞(D) under 𝖡 must have a maximal element (since 𝖡 is an order isomorphism), and so there is a maximal commutative C*-subalgebra D̃ of B such that 𝖡|_D : 𝒞(D) → 𝒞(D̃) is an order isomorphism. Applying the previous result, we obtain an isomorphism J_D : D → D̃ of commutative C*-algebras that induces 𝖡|_D. The same applies to E, so we also have an isomorphism J_E : E → Ẽ of commutative C*-algebras that induces 𝖡|_E. Let C = D ∩ E, which lies in 𝒞(A). We now show that J_D and J_E coincide on C. There are three cases.


1. dim(C) = 1. In that case C = ℂ·1_A is the bottom element of 𝒞(A), so it must be sent to the bottom element C̃ = ℂ·1_B of 𝒞(B), whence the claim.

2. dim(C) = 2. This is the hard case dealt with below.

3. dim(C) > 2. This case is settled by the uniqueness claim in Lemma 9.5.


So assume dim(C) = 2. In that case, C = C*(e) for some proper projection e ∈ 𝒫(A), which is equivalent to C being an atom in 𝒞(A). Recall that all our C*-algebras are unital, and that by assumption C*-subalgebras C share the unit of the ambient C*-algebra A, hence C*(e) contains the unit of A. Hence C̃ = 𝖡(C) = 𝖡|_D(C) = 𝖡|_E(C) is an atom in 𝒞(B), which implies that C̃ = C*(ẽ) for some projection ẽ ∈ 𝒫(B). If J_D(e) = J_E(e) we are done, so we must exclude the case J_D(e) = ẽ, J_E(e) = 1_B − ẽ. This exclusion again requires a case distinction:


\[ \dim(eAe) = \dim(e^\perp A e^\perp) = 1; \tag{9.16} \]
\[ \dim(eAe) = 1, \quad \dim(e^\perp A e^\perp) > 1; \tag{9.17} \]
\[ \dim(eAe) > 1, \quad \dim(e^\perp A e^\perp) > 1, \tag{9.18} \]


where e^⊥ = 1_A − e. Each of these cases is nontrivial, and we need another lemma.
Lemma 9.6. Let C ∈ 𝒞(A) be maximal (i.e., C ⊂ A is maximal abelian).


1. For each projection e ∈ 𝒫(C) we have dim(eCe) = 1 iff dim(eAe) = 1.
2. We have dim(C) = 2 iff either A ≅ ℂ² or A ≅ M₂(ℂ).


Proof. For the first claim dim(eAe) = 1 clearly implies dim(eCe) = 1. For the converse implication, assume ad absurdum that dim(eAe) > 1, so that there is an a ∈ A for which eae ≠ λ·e for any λ ∈ ℂ. If also dim(eCe) = 1, then any c ∈ C takes the form c = μ·e + e^⊥ce^⊥ for some μ ∈ ℂ. Indeed, since c, e, e^⊥ commute within C,

\[ c = ce + ce^\perp = ce^2 + c(e^\perp)^2 = ece + e^\perp c e^\perp = \mu e + e^\perp c e^\perp, \tag{9.19} \]


where the last equality follows since ece ∈ eCe, which is spanned by e. This implies that eae ∈ C′ (where C′ is the commutant of C within A), and since C is maximal abelian, we have C = C′, whence eae ∈ C. Now eae = e(eae)e, hence eae ∈ eCe, whence eae = λ·e for some λ ∈ ℂ. Contradiction. According to Theorem C.169.1, the assumption dim(C) = 2 implies that A is finite-dimensional, upon which Theorem C.163 and (C.641) yield the second claim.


Having proved Lemma 9.6, we move on to analyze the cases (9.16) - (9.18).


• Eq. (9.16) implies that C is maximal, as follows. Any element a ∈ A is a sum of eae, e^⊥ae^⊥, eae^⊥, and e^⊥ae; nonzero elements of C′ = {e}′ can only be of the first two types. If (9.16) holds, then dim(C′) = 2, but since C is abelian we have C ⊆ C′ and since dim(C) = 2 we obtain C′ = C. Lemma 9.6.2 then implies that either A ≅ ℂ² or A ≅ M₂(ℂ). These C*-algebras have been analyzed after the statement of Theorem 9.4, and since those two A's conversely imply (9.16), we may exclude them in dealing with (9.17) - (9.18). By Lemma 9.6.2 (applied to D and E instead of C), in what follows we may assume that dim(D) > 2 and dim(E) > 2 (as D and E are maximal).

• Eq. (9.17) implies dim(eD) = 1. Assuming J_D(e) = ẽ, this implies dim(ẽD̃) = 1 (since J_D is an isomorphism). Applying Lemma 9.6.1 to B gives dim(ẽBẽ) = 1 (since D̃ is maximal). If also dim((1_B − ẽ)B(1_B − ẽ)) = 1, then dim(D̃) = 2, whence dim(D) = 2, which we excluded. Hence


\[ \dim\big( (1_B - \tilde{e}) B (1_B - \tilde{e}) \big) > 1. \tag{9.20} \]


Applied to J_E this gives J_E(e) = ẽ, and hence J_D and J_E coincide on C = C*(e).
• Eq. (9.18) implies that dim(eDe) > 1 as well as dim(e^⊥Ee^⊥) > 1 (apply Lemma 9.6.1 to D and E). Since dim(eDe) > 1, there is some a ∈ D such that e and a′ = eae ∈ D are linearly independent, and similarly there is some b ∈ E such that b′ = e^⊥be^⊥ is linearly independent of e^⊥. Then a′, b′, e commute (in fact, a′b′ = b′a′ = 0), so that we may form the abelian C*-algebras C₁ = C*(e,a′) ⊆ D and C₂ = C*(e,b′) ⊆ E, which (also containing the unit 1_A) both have dimension at least three. We also form C₃ = C*(e,a′,b′), which contains C₁ and C₂ and hence is at least three-dimensional, too. Because D and E are maximal abelian, C₃ must lie in both D and E. Applying the abelian case of the theorem already proved to D and E, as before, but replacing C used so far by C₃, we find that J_D and J_E coincide on C₃ (as its dimension is > 2). In particular, J_D(e) = J_E(e).


To finish the proof, we first note that Theorem 9.4 holds for A = B = ℂ by inspection, whereas the cases A ≅ B ≅ ℂ² or ≅ M₂(ℂ) have already been discussed.

In all other cases we define J : A_sa → B_sa by putting J(a) = J_D(a) for any maximal abelian unital C*-subalgebra D containing C = C*(a) and hence a; as we just saw, this is independent of the choice of D. Since each J_D is an isomorphism of commutative C*-algebras, J is a weak Jordan isomorphism. Finally, uniqueness of J (under the stated restriction on A) follows from Lemma 9.5.
Theorem 9.4 raises the question whether we can strengthen weak Jordan isomorphisms to Jordan isomorphisms (i.e. invertible linear maps that preserve the Jordan product, cf. Appendix C.25). This hinges on the extendibility of weak Jordan isomorphisms to linear maps (which of course continue to preserve the Jordan product and hence are automatically Jordan isomorphisms). A general result in this direction is:


Theorem 9.7. Let A and B be unital AW*-algebras, where A contains no summand of type I₂. Then there is a bijective correspondence between order isomorphisms 𝖡 : 𝒞(A) → 𝒞(B) and Jordan isomorphisms J : A_sa → B_sa.


This follows from Gleason's Theorem for AW*-algebras, which we will neither state nor prove. If A = B = B(H), then the ordinary Gleason Theorem suffices to yield the crucial lemma for Wigner's Theorem for Bohr symmetries (i.e. Theorem 5.4.6):


Lemma 9.8. Let H be a Hilbert space of dimension greater than two. Then any Bohr symmetry of 𝒞(B(H)) is induced by a Jordan symmetry of B(H)_sa.


Proof. This follows from Theorem 9.4 and Corollary 5.22, which for the case at 
hand turns weak Jordan isomorphisms into Jordan isomorphisms. 


We finally turn to symmetries of projection lattices. Theorem C.174 shows that for von Neumann algebras (and more generally for AW*-algebras) A (without summand of type I₂) and B, any isomorphism 𝖭 : 𝒫(A) → 𝒫(B) of the corresponding orthocomplemented projection lattices (which automatically preserves arbitrary suprema) is the restriction of a unique Jordan isomorphism J : A_sa → B_sa.

This completes the argument to the effect that for many C*-algebras of observables A (including B(H) for dim(H) > 1 as far as nos. 1-4 are concerned, and having dim(H) > 2 if we also include nos. 5-6) our six seemingly different notions of symmetry of a quantum system described by a C*-algebra are equivalent. In particular, they are equivalent to Jordan isomorphisms, which are also the easiest ones to use, as they involve a readily identifiable part A_sa of A, and (by complexification, as explained above) may even be defined on A itself (namely as those complex-linear isomorphisms that preserve the involution * as well as the Jordan product ∘).

Putting B = A and assuming (without loss of generality) that A ⊂ B(H), Theorem C.175 then yields a separation of Jordan automorphisms into three disjoint classes:


Corollary 9.9. If J is a Jordan symmetry of a unital C*-algebra A ⊂ B(H), then there are three mutually orthogonal projections e₁, e₂, e₃ in A′ ∩ A″ such that:


1. e₁ + e₂ + e₃ = 1_H;

2. The map a ↦ J(a)e₁ from A to B(e₁H) is a homomorphism (of C*-algebras);

3. The map a ↦ J(a)e₂ from A to B(e₂H) is an anti-homomorphism (ibid.);

4. The map a ↦ J(a)e₃ from A to B(e₃H) is both a homomorphism and an anti-homomorphism of C*-algebras (so that the "corner" J(A)e₃ is commutative).


If in addition a ↦ J(a)e₁ is not an anti-homomorphism and a ↦ J(a)e₂ is not a homomorphism, then e₁, e₂, and e₃ are uniquely determined by these conditions.
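
A standard instance of the second class is transposition on A = M₂(ℂ) ⊂ B(ℂ²), where A′ ∩ A″ = ℂ·1_H leaves only e₂ = 1_H. The Python sketch below is an illustration under assumptions made here: the two sample matrices are arbitrary, real entries are used for simplicity, and the helper names are not from the text. It checks that transposition preserves the Jordan product a ∘ b = (ab + ba)/2 and reverses ordinary products without preserving them.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def jordan(a, b):
    """Jordan product (ab + ba) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(n)] for i in range(n)]

a = [[1, 2], [3, 4]]       # arbitrary sample matrices
b = [[0, 1], [5, 2]]

anti_multiplicative = transpose(matmul(a, b)) == matmul(transpose(b), transpose(a))
multiplicative = transpose(matmul(a, b)) == matmul(transpose(a), transpose(b))
jordan_preserved = transpose(jordan(a, b)) == jordan(transpose(a), transpose(b))
```

For these matrices the first and third checks succeed and the second fails, so transposition is an anti-homomorphism that is not a homomorphism.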


As we shall now see, if the symmetries form a (Lie) group, then this result often justifies restricting our attention simply to homomorphisms of C*-algebras.


9.2 Unitary implementability of symmetries


There are good reasons for the dichotomy (or even trichotomy) between homomorphisms and anti-homomorphisms of C*-algebras left by Corollary 9.9, since in physics certain discrete symmetries of quantum theory indeed give rise to anti-homomorphisms: the best-known examples are time inversion T and charge conjugation C combined with space inversion (i.e. parity) P, giving CP (there are also other examples in condensed matter physics, like quantum spin flip). However, for the kind of problems mainly addressed in this book it is sufficient to restrict our attention to homomorphisms. One reason is that even if we use discrete symmetries (where the simplest non-trivial group ℤ₂ often suffices to make our point), the models we treat simply realize these symmetries as homomorphisms. Another reason is that if symmetries join to form a connected topological group G (typically a Lie group) and the maps x ↦ J_x sending x ∈ G to some Jordan symmetry J_x of the given C*-algebra A of observables form a (strongly) continuous homomorphism (see below), then the identity e ∈ G must be mapped to the identity id_A, which of course is a homomorphism of A. Continuity then implies that all J_x must be homomorphisms.

In what follows we therefore assume that G is a (topological) group and that we are given a (continuous) homomorphism x ↦ α_x from G into the group Aut(A) of all automorphisms of A; note that, given our restriction to homomorphisms, we switch notation from J to the customary symbol α. Continuity here always means strong continuity, in that for each a ∈ A the map x ↦ α_x(a) from G to A is continuous (so that the map G × A → A given by (x,a) ↦ α_x(a) is continuous, as usually required for group actions in a topological setting, cf. Proposition 5.35).

It follows from Theorem 5.4 (technically, from part 4 of that theorem, but "morally" from all of it, including the equivalences between all kinds of symmetries) that if A = B(H), then a homomorphism α : G → Aut(B(H)) is always implemented by a family u(x) of unitary operators on H, in that


\[ \alpha_x(a) = u(x)\, a\, u(x)^* \quad (x \in G). \tag{9.21} \]


The group representation property α_x ∘ α_y = α_xy does not enforce u(x)u(y) = u(xy); indeed, as we saw in detail in §5.10 one may have a projective unitary representation x ↦ u(x) of G on H. However, by Theorem 5.62 one may usually pass to a central extension G̃ of G for which this problem does not arise (e.g., G̃ = SU(2) for G = SO(3)). In Corollary 9.12 below (unbroken symmetry), even such a passage is not necessary.
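
The gap between the automorphisms α_x and the unitaries u(x) is already visible for spin 1/2. The Python sketch below is only an illustration under assumptions made here: it uses the SU(2) elements −iσ_x and −iσ_y, which implement rotations by π about the x- and y-axes (two rotations that commute in SO(3)), together with an arbitrary test matrix a; the helper names are hypothetical. The two products of unitaries differ by a sign, yet they induce the same automorphism of M₂(ℂ).

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(u, a):
    """The automorphism a -> u a u*."""
    return matmul(matmul(u, a), dagger(u))

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
ux = [[-1j * v for v in row] for row in sigma_x]   # exp(-i*pi*sigma_x/2)
uy = [[-1j * v for v in row] for row in sigma_y]   # exp(-i*pi*sigma_y/2)

uxuy = matmul(ux, uy)
uyux = matmul(uy, ux)
a = [[1, 2 + 1j], [3, -4j]]                         # arbitrary test matrix

sign_flip = uxuy == [[-v for v in row] for row in uyux]
same_automorphism = conjugate_by(uxuy, a) == conjugate_by(uyux, a)
```

The sign is invisible in (9.21) because it cancels between u and u*, which is exactly the freedom a projective representation exploits.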
For general C*-algebras A, especially those modeling either classical systems (in which case A is commutative) or infinite quantum systems (where A is typically an infinite tensor product), one rarely has α(a) = uau* for some u ∈ A even for single automorphisms α, let alone for a whole group of them. Instead, we settle for a weaker notion of unitary implementability, where the unitary u need not be in A.


Definition 9.10. Let π : A → B(H) be a representation of A. An automorphism α ∈ Aut(A) is implemented in H if there exists a unitary operator u : H → H such that


\[ \pi(\alpha(a)) = u\, \pi(a)\, u^* \quad (a \in A). \tag{9.22} \]
The fundamental criterion for implementability uses the pullback α* : S(A) → S(A) of α : A → A to the state space S(A), defined by α*ω = ω ∘ α⁻¹; cf. §C.25.


Theorem 9.11. An automorphism α : A → A can be implemented in the GNS-representation π_ω defined by a state ω on A iff π_{α*ω} and π_ω are unitarily equivalent.


Proof. Whether or not π_{α*ω} and π_ω are unitarily equivalent, we may define an operator


\[ w : H_\omega \to H_{\alpha^*\omega} \tag{9.23} \]


on the dense subspace π_ω(A)Ω_ω by wπ_ω(a)Ω_ω = π_{α*ω}(α(a))Ω_{α*ω} (a ∈ A). This operator is well defined and unitary, and satisfies wΩ_ω = Ω_{α*ω} as well as wπ_ω(a)w* = π_{α*ω}(α(a)); these properties even characterize w. If π_{α*ω} ≅ π_ω, there exists a unitary v : H_ω → H_{α*ω} satisfying vπ_ω(a)v* = π_{α*ω}(a), a ∈ A. Then u = v*w satisfies (9.22) for π = π_ω. The converse is similar.


An important special case arises if ω is invariant under α.


Corollary 9.12. If α*ω = ω (that is, ω(α(a)) = ω(a) for all a ∈ A), then α is implemented by a unitary operator u_ω : H_ω → H_ω satisfying u_ωΩ_ω = Ω_ω. In particular, given a continuous homomorphism α : G → Aut(A) such that α_x*ω = ω for each x ∈ G, one has a family of unitaries u_ω(x) : H_ω → H_ω that for all x ∈ G satisfy

\[ \pi_\omega(\alpha_x(a)) = u_\omega(x)\, \pi_\omega(a)\, u_\omega(x)^*, \tag{9.26} \]


and form a continuous unitary representation of G on H_ω.


Proof. One easily checks that the following operators do the job:


\[ u_\omega(x)\, \pi_\omega(a)\, \Omega_\omega = \pi_\omega(\alpha_x(a))\, \Omega_\omega. \]
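
A finite toy model may make the construction concrete. The Python sketch below rests on assumptions chosen here for illustration: A = C(ℤ_N) with N = 5, α is translation by one step, ω is the uniform state, whose GNS space is ℓ²(ℤ_N) with A acting by multiplication, and the cyclic vector is left unnormalised to keep integer arithmetic. All function names are hypothetical. The cyclic shift then plays the role of u_ω: it fixes the cyclic vector and implements α.

```python
N = 5

def alpha(a):
    """Translation automorphism of C(Z_N): (alpha a)(k) = a(k - 1)."""
    return [a[(k - 1) % N] for k in range(N)]

def pi(a, psi):
    """GNS representation of the uniform state: multiplication by a."""
    return [a[k] * psi[k] for k in range(N)]

def u(psi):
    """Candidate implementing unitary: the cyclic shift."""
    return [psi[(k - 1) % N] for k in range(N)]

def u_star(psi):
    """Adjoint (inverse) of the cyclic shift."""
    return [psi[(k + 1) % N] for k in range(N)]

omega_vec = [1] * N            # cyclic vector (constant function), unnormalised
a = [2, 3, 5, 7, 11]           # arbitrary element of C(Z_N)
psi = [1, -1, 4, 0, 6]         # arbitrary test vector

lhs = pi(alpha(a), psi)        # pi(alpha(a)) psi
rhs = u(pi(a, u_star(psi)))    # u pi(a) u* psi
```

The identity u(pi(a, omega_vec)) == pi(alpha(a), omega_vec) is the defining formula of the proof, read on this example.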


Given some α ∈ Aut(A), a weak form of spontaneous symmetry breaking (SSB) is that some state ω (it is always a state that breaks a symmetry) satisfies α*ω ≠ ω; a stronger one states that the two equivalent conditions in Theorem 9.11 are violated, i.e., that α cannot be implemented in the GNS-representation π_ω(A) (cf. Definition 9.10). In order to be physically relevant, the weaker notion has to be supplemented with additional structure, which also guarantees that generically the weak form implies the strong one. Part of this structure involves the identification of suitable classes of states within which we define SSB; these classes are predicated on a time-evolution on A. We also need a symmetry group instead of a single automorphism α (which implicitly uses the group ℤ_p = ℤ/p·ℤ, where p is the smallest positive integer such that α^p = id_A; if no such p exists the group is just ℤ). Thus we need:


• A C*-algebra A with time-evolution, i.e., a homomorphism α : ℝ → Aut(A);
• A preferred class of states defined via α, viz. ground states or equilibrium states;
• A symmetry group G acting on A via a homomorphism γ : G → Aut(A) satisfying


\[ \gamma_g \circ \alpha_t = \alpha_t \circ \gamma_g \quad (t \in \mathbb{R},\ g \in G). \tag{9.27} \]


9.3 Motion in space and in time


The C*-algebras A we are going to use are the quasi-local ones introduced in §8.5 for quantum spin systems; especially recall (8.130). Also, the C*-algebra A = B^∞ in §8.2 is a case in point, but this would require some changes in what follows. The last expression in (8.130) is convenient for introducing spatial translation symmetry


\[ \tau : \mathbb{Z}^d \to \mathrm{Aut}(A) \tag{9.28} \]
of ℤ^d, as follows: for x ∈ ℤ^d, define τ_x : A_Λ → A_{x+Λ}, initially by
\[ \tau_x(b(y)) = b(x+y), \tag{9.29} \]


where, for given b ∈ B(H) and y ∈ Λ, the operator b(y) ∈ A_Λ is the element ⊗_{z∈Λ} a_z with a_y = b and a_z = 1_H whenever z ≠ y. Since arbitrary elements of A_Λ are (norm-limits of) finite linear combinations of products of such operators b(y), the automorphic (and hence isometric) property of τ_x defines its action on all of A_Λ (if necessary by continuous extension). Note that for a ∈ A_Λ the operator τ_x(a) thus defined is independent of the (typically non-unique) realization of a in terms of the b(y), because τ_x is an isometry. The group homomorphism property of the map (9.28) thus constructed is guaranteed by (9.29), whilst continuity is no issue since ℤ^d is discrete.

Since A_Λ = ⊗_{y∈Λ} A_y with A_y = B(H), an equivalent way to define τ_x is to use identifications id_{yz} : A_y → A_z (since A_y = A_z = B(H)), which, taking tensor products, yield isomorphisms id_{ΛΛ′} : A_Λ → A_Λ′ whenever some bijection Λ ≅ Λ′ is given. In terms of those, we simply have (τ_x)|_{A_Λ} = id_{Λ,x+Λ}. Either way, the maps (τ_x)|_{A_Λ} extend to τ_x : A → A by continuity. The following property then holds:


Proposition 9.13. An automorphic action τ of ℤ^d on a quasi-local C*-algebra A is asymptotically abelian in the sense that lim_{x→∞} ‖[a, τ_x(b)]‖ = 0 for all a,b ∈ A.


Here x → ∞ means that any sequence (x_n) with |x_n| → ∞ with respect to the Euclidean norm on ℤ^d has a subsequence (x′_n) for which the stated result holds.


Proof. For a and b local, i.e., a ∈ A_{Λ⁽¹⁾} and b ∈ A_{Λ⁽²⁾}, this follows from Einstein locality. The general case follows by approximating a and b by local elements.


Thus quasi-local C*-algebras A satisfy the assumptions in the following theorem, which will be important in linking the various notions of SSB discussed earlier.


Theorem 9.14. Let A be a C*-algebra equipped with an asymptotically abelian action τ of ℤ^d, and let ω be a translation-invariant primary state on A (i.e., τ_x*ω = ω for all x ∈ ℤ^d). Then Ω_ω is the only translation-invariant vector in H_ω (up to scalar multiples). Moreover,


\[ \lim_{x \to \infty} \omega(a\, \tau_x(b)) = \omega(a)\, \omega(b) \quad (a, b \in A); \tag{9.30} \]

\[ \lim_{x \to \infty} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A); \tag{9.31} \]
\[ \lim_{\Lambda \nearrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A). \tag{9.32} \]
Here (9.31) and (9.32) hold in the weak operator topology on B(H_ω), and the limit Λ ↗ ℤ^d is taken along the hypercubes Λ_N in (8.153) as N → ∞.


Proof. If ω is primary, Theorem 8.23 (or its proof) yields
\[ \lim_{x \to \infty} \big| \omega(a\, \tau_x(b)) - \omega(a)\, \omega(\tau_x(b)) \big| = 0. \tag{9.33} \]


Translation-invariance of ω then yields (9.30), which also is a lemma for (9.31) - (9.32). Towards (9.31) we compute ω(aτ_x(b)) in terms of the projection


\[ e_0 = \lim_{\Lambda \nearrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} u(x) \tag{9.34} \]


onto the translation-invariant subspace of H_ω, where u is the unitary representation of ℤ^d on H_ω from Corollary 9.12 (with G = ℤ^d), and the limit is taken in the strong operator topology. Eq. (9.34) is a special case of von Neumann's L² ergodic theorem (which generalizes the Peter-Weyl-Schur relation e₀ = ∫_G dx u(x) for compact groups G to amenable groups like ℤ^d or ℝ^d). Since e₀Ω_ω = Ω_ω, we have


\[ \omega(a\, \tau_x(b)) = \langle \Omega_\omega, \pi_\omega(a)\, \pi_\omega(\tau_x(b))\, \Omega_\omega \rangle \tag{9.35} \]


\[ = \langle \Omega_\omega, \pi_\omega(a) \big( [\pi_\omega(\tau_x(b)), e_0] + e_0\, \pi_\omega(b) \big) \Omega_\omega \rangle. \tag{9.36} \]


We now let x → ∞. The commutator then vanishes, because the weak limit of π_ω(τ_x(b)) lies in the center of π_ω(A)″, which is trivial since ω is primary. The remaining term matches with (9.30) iff e₀ is one-dimensional, so that Ω_ω is the only translation-invariant vector in H_ω, and e₀ = |Ω_ω⟩⟨Ω_ω|. A similar trick then yields


\[ \pi_\omega(\tau_x(b))\, \pi_\omega(a)\, \Omega_\omega = \Big( [\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a) \big( [\pi_\omega(\tau_x(b)), e_0] + \omega(b) \big) \Big) \Omega_\omega. \]
Both commutators vanish (weakly) as x → ∞, proving (9.31). Similarly, write


\[ \pi_\omega(\tau_x(b))\, \pi_\omega(a)\, \Omega_\omega = \big( [\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a)\, u(x)\, \pi_\omega(b) \big) \Omega_\omega, \tag{9.37} \]


and use (9.34) and the previous formula for e₀ to prove (9.32).


In the C*-algebraic formalism, dynamics is described by a continuous homomorphism
$\alpha : \mathbb{R} \to \mathrm{Aut}(A)$, $t \mapsto \alpha_t$. For $A = B(H')$, where $H'$ is some Hilbert space (not
to be confused with our earlier $H$ in the quasi-local setting), Theorem 5.4 yields

$$\alpha_t(a) = u_t a u_t^* \tag{9.38}$$

for some family of unitaries $u_t = u(t)$, $t \in \mathbb{R}$. Eq. (5.268) and Proposition 5.53 then
imply that the family $u_t$ may be redefined so as to make the map $t \mapsto u_t$ a continuous
unitary representation of $\mathbb{R}$ on $H'$. Stone's Theorem 5.73 finally gives the familiar
expression for time evolution in the so-called Heisenberg picture in terms of the
Hamiltonian $h$, which is a (possibly unbounded) self-adjoint operator on $H'$, i.e.,

$$\alpha_t(a) = e^{ith} a e^{-ith}. \tag{9.39}$$
For arbitrary (unital) C*-algebras $A$ one has no counterpart of Theorem 5.4, and
one cannot rely on Theorem 9.11 either because there are no preferred states to begin
with; such states typically require a time-evolution for their definition (see below).
For quantum spin systems (still with $H = \mathbb{C}^n$ and hence $B(H) \cong M_n(\mathbb{C})$), one tries to
construct the map $t \mapsto \alpha_t$ from local approximations: with $A_\Lambda$ given by (8.129) with
(8.128), we pick local Hamiltonians $h_\Lambda \in B(H_\Lambda)$ and define maps $t \mapsto \alpha_t^{(\Lambda)} \in \mathrm{Aut}(A_\Lambda)$ by

$$\alpha_t^{(\Lambda)}(a) = e^{ith_\Lambda} a e^{-ith_\Lambda}, \tag{9.40}$$

where $a \in A_\Lambda$. Letting $\Lambda \nearrow \mathbb{Z}^d$, we would then like to assemble the family $\alpha^{(\Lambda)}$ into
a single automorphism group $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$, which describes the dynamics of the
corresponding infinite quantum system. Towards this aim, we start from a potential
(also called an interaction) $\Phi(X) \in B(H_X)$, which is defined for any finite sublattice
$X$ of $\mathbb{Z}^d$, in terms of which the local Hamiltonians $h_\Lambda$ take the form

$$h_\Lambda = \sum_{X\subset\Lambda} \Phi(X), \tag{9.41}$$


where the sum is over all sublattices $X$ of $\Lambda$. For nearest-neighbour interactions,
$\Phi(X)$ is nonzero iff $X = \{x,y\}$ is a pair of neighbours, and in the presence of an
external magnetic field one also has terms proportional to $\Phi(\{x\})$. For example,
the quantum Ising model is defined by $H = \mathbb{C}^2$ and $\Phi(\{x,y\}) = -J\sigma_3(x)\sigma_3(y)$ for
nearest neighbours and $\Phi(\{x\}) = -B\sigma_1(x)$ for all $x$, where $J > 0$ and $B \in \mathbb{R}$. The
local Hamiltonians are therefore given by

$$h_\Lambda = -J\sum_{\langle xy\rangle\in\Lambda} \sigma_3(x)\sigma_3(y) - B\sum_{x\in\Lambda} \sigma_1(x), \tag{9.42}$$


where the sum over $\langle xy\rangle \in \Lambda$ denotes summing over nearest neighbours in $\Lambda$. The
expression (9.42) implicitly has so-called free boundary conditions, in that only
neighbours inside $\Lambda$ take part in $h_\Lambda$. Alternatively, one could use periodic boundary
conditions, which in $d = 1$ define the quantum Ising chain

$$h_N = -J\left(\sum_{x=1}^{N-1} \sigma_3(x)\sigma_3(x+1) + \sigma_3(N)\sigma_3(1)\right) - B\sum_{x=1}^{N} \sigma_1(x). \tag{9.43}$$
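As a small numerical illustration of (9.42) and (9.43), the following sketch builds the local Hamiltonians of the quantum Ising chain as explicit matrices. The helper names `site_op` and `ising_chain`, the 0-based site labels, and the parameter values are illustrative assumptions, not notation from the text.

```python
import numpy as np

# Pauli matrices: sigma_3 is diagonal, sigma_1 flips the spin.
S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)


def site_op(op, x, n):
    """Embed a single-site operator at site x (0-based) into an n-site chain."""
    out = np.eye(1)
    for y in range(n):
        out = np.kron(out, op if y == x else I2)
    return out


def ising_chain(n, J, B, periodic=False):
    """Matrix of the quantum Ising chain on n sites, cf. (9.42) and (9.43)."""
    h = np.zeros((2**n, 2**n))
    for x in range(n - 1):
        h -= J * site_op(S3, x, n) @ site_op(S3, x + 1, n)
    if periodic:
        # Extra bond closing the chain (hypothetical 0-based labelling).
        h -= J * site_op(S3, n - 1, n) @ site_op(S3, 0, n)
    for x in range(n):
        h -= B * site_op(S1, x, n)
    return h


h_free = ising_chain(4, J=1.0, B=0.5)
h_per = ising_chain(4, J=1.0, B=0.5, periodic=True)
# At B = 0 the free chain has two degenerate lowest eigenvalues.
energies = np.linalg.eigvalsh(ising_chain(4, J=1.0, B=0.0))
```

The two boundary conditions differ by exactly one bond term, and at $B = 0$ the lowest eigenvalue of the free four-site chain is doubly degenerate.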

In (9.42) - (9.43) the operators $\sigma_i(x)$ in $A_\Lambda$ are defined as explained after (9.29). We
are going to study the quantum Ising chain in detail in connection with SSB; for
the moment, we just mention another popular spin model, namely the Heisenberg
model for magnetism. This also has $H = \mathbb{C}^2$, but the local Hamiltonians are

$$h_\Lambda = J\sum_{\langle xy\rangle\in\Lambda}\sum_{i=1}^{3} \sigma_i(x)\sigma_i(y), \tag{9.44}$$

with free boundary conditions, where $J < 0$ ($J > 0$) yields (anti)ferromagnetism.
Although we do not have (9.38) for any $u_t \in A$, we may construct $\alpha_t$ as follows.
Theorem 9.15. Let $\Phi$ be a short-range potential in that there is $r \in \mathbb{N}$ such that
$\Phi(X) \neq 0$ only if $|x - y| \leq r$ for all $x,y \in X$, and define local Hamiltonians $h_\Lambda$ by
(9.41). For fixed finite $\Lambda \subset \mathbb{Z}^d$ and $a \in A_\Lambda$, the following (norm) limit exists and
defines an automorphism $\alpha_t$ of $\cup_{\Lambda\subset\mathbb{Z}^d} A_\Lambda$ and hence by continuity also of $A$:

$$\alpha_t(a) = \lim_{N\to\infty} e^{ith_{\Lambda_N}} a e^{-ith_{\Lambda_N}}. \tag{9.45}$$


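Proof. Note that for large enough $N$, the hypercube $\Lambda_N$ contains any $\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)$.
Take $a \in A_\Lambda$, take $\Lambda_{N_2} \supset \Lambda_{N_1} \supset \Lambda$, and use (9.40) and (9.41) to compute

$$\begin{aligned}
\|\alpha_t^{(\Lambda_{N_2})}(a) - \alpha_t^{(\Lambda_{N_1})}(a)\| &= \left\|\int_0^t ds\, \frac{d}{ds}\left(\alpha_s^{(\Lambda_{N_2})}\circ\alpha_{t-s}^{(\Lambda_{N_1})}(a)\right)\right\| \\
&= \left\|\int_0^t ds\, \alpha_s^{(\Lambda_{N_2})}\left(i[h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]\right)\right\| \\
&\leq \int_0^t ds\, \left\|\alpha_s^{(\Lambda_{N_2})}\left([h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]\right)\right\| \\
&= \int_0^t ds\, \left\|[h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]\right\| \\
&\leq \sum_{x\in\Lambda_{N_2}\setminus\Lambda_{N_1}}\sum_{X\ni x}\int_0^t ds\, \left\|[\Phi(X), \alpha_{t-s}^{(\Lambda_{N_1})}(a)]\right\|.
\end{aligned} \tag{9.46}$$

We now show that the left-hand side of the first line is a Cauchy sequence. Since

$$\alpha_{t-s}^{(\Lambda_{N_1})}(a) = e^{i(t-s)\sum_{Y\subset\Lambda_{N_1}}\Phi(Y)}\, a\, e^{-i(t-s)\sum_{Y\subset\Lambda_{N_1}}\Phi(Y)} \in B(H_{\Lambda_{N_1}}), \tag{9.47}$$

which is finite-dimensional (as $\Lambda_{N_1}$ is finite), we have a norm-convergent expansion

$$\alpha_t^{(\Lambda_{N_1})}(a) = a + it\sum_{Y_1\subset\Lambda_{N_1}} [\Phi(Y_1), a] + \frac{(it)^2}{2!}\sum_{Y_1,Y_2\subset\Lambda_{N_1}} [\Phi(Y_1), [\Phi(Y_2), a]] + \cdots \tag{9.48}$$

Let $\Lambda(r)$ consist of all $y \in \mathbb{Z}^d$ for which there is some $x \in \Lambda$ for which $|x - y| \leq r$.
Then the zeroth term $a$ in (9.48) is in $A_\Lambda$, the first is in $A_{\Lambda(r)}$, and so on; the $n$'th is in
$A_{\Lambda(nr)}$. Therefore, we can find $n = n(N_1,N_2,r)$ such that the only terms in (9.48) that
contribute to the commutator in (9.46) are the $n$'th and beyond. Taking $\Lambda_{N_1}$ and $\Lambda_{N_2}$
large enough, this tail can be made arbitrarily small, so that $(\alpha_t^{(\Lambda_N)}(a))_N$ is a Cauchy
sequence in $A$. This gives convergence of (9.45) for $a \in A_\Lambda$, where $\Lambda$ is arbitrary
(but finite), yielding an automorphism $\alpha_t$ of $\cup_\Lambda A_\Lambda$. Being an automorphism, $\alpha_t$ is
isometric, so that it extends to $A$ by continuity.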
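The Cauchy property used in this proof can be observed numerically in a small example. The sketch below is only an illustration under stated assumptions: it takes the quantum Ising interaction with $J = B = 1$, the time $t = 0.3$, a seven-site ambient chain, and the observable $\sigma_1$ at the central site (all hypothetical choices), and compares the local evolutions for three nested blocks.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
N_TOTAL = 7   # ambient chain with sites 0..6 (illustrative size)
CENTRE = 3    # the observable sits at the central site


def site_op(op, x):
    """Embed a single-site operator at site x into the ambient chain."""
    out = np.eye(1, dtype=complex)
    for y in range(N_TOTAL):
        out = np.kron(out, op if y == x else I2)
    return out


def local_hamiltonian(k, J=1.0, B=1.0):
    """Free-boundary Ising Hamiltonian on the block {CENTRE-k, ..., CENTRE+k}."""
    sites = range(CENTRE - k, CENTRE + k + 1)
    h = np.zeros((2**N_TOTAL, 2**N_TOTAL), dtype=complex)
    for x in sites:
        h -= B * site_op(S1, x)
        if x + 1 in sites:
            h -= J * site_op(S3, x) @ site_op(S3, x + 1)
    return h


def evolve(a, h, t):
    """Heisenberg evolution e^{ith} a e^{-ith} via the spectral decomposition of h."""
    lam, v = np.linalg.eigh(h)
    u = (v * np.exp(1j * t * lam)) @ v.conj().T
    return u @ a @ u.conj().T


a = site_op(S1, CENTRE)
t = 0.3
evolved = [evolve(a, local_hamiltonian(k), t) for k in (1, 2, 3)]
# Operator-norm distance between evolutions on successive blocks.
d1 = np.linalg.norm(evolved[1] - evolved[0], 2)
d2 = np.linalg.norm(evolved[2] - evolved[1], 2)
print(d1, d2)
```

In this example the distance between successive local evolutions shrinks as the block grows, which is the behaviour the estimate (9.46) controls.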
9.4 Ground states of quantum systems


A ground state of a finite system $A_\Lambda = B(H_\Lambda)$ is an eigenstate of the local Hamiltonian
$h_\Lambda$ with the lowest eigenvalue; because $\dim(H_\Lambda) < \infty$, the spectrum of $h_\Lambda$ is
discrete and hence local ground states exist. For infinite systems, no Hamiltonian is
yet defined, so we need to define ground states in terms of the dynamics $\alpha_t$.


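Definition 9.16. Let $A$ be a C*-algebra with time evolution, i.e., a continuous homomorphism
$\alpha : \mathbb{R} \to \mathrm{Aut}(A)$ (which gives the dynamics of the underlying physical
system). A ground state of $(A, \alpha)$ is a state $\omega$ on $A$ such that:


1. $\omega$ is time-independent, i.e. $\alpha_t^*\omega = \omega$ (or $\omega(\alpha_t(a)) = \omega(a)$ for all $a \in A$) $\forall t \in \mathbb{R}$;
2. The generator $h_\omega$ of the ensuing continuous unitary representation

$$t \mapsto u_t = e^{ith_\omega} \tag{9.49}$$

of $\mathbb{R}$ on $H_\omega$ has positive spectrum, i.e., $\sigma(h_\omega) \subset \mathbb{R}^+$, or equivalently,

$$\langle\psi, h_\omega\psi\rangle \geq 0 \quad (\psi \in D(h_\omega)). \tag{9.50}$$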


Note that the existence of the operator $h_\omega$ is guaranteed by Corollary 9.12 and the
arguments after (9.38). Since Corollary 9.12 yields

$$\pi_\omega(\alpha_t(a)) = e^{ith_\omega}\pi_\omega(a)e^{-ith_\omega}, \tag{9.52}$$

it follows that $h_\omega$ is a Hamiltonian in the usual sense, implementing the Heisenberg-picture
time evolution (albeit in the representation $\pi_\omega(A)$ rather than in $A$ itself).
Moreover, in view of (9.51) and the assumed positivity of $\sigma(h_\omega)$, the unit vector
$\Omega_\omega$ of the GNS-representation $\pi_\omega$ induced by a ground state $\omega$ is a ground state
for the Hamiltonian $h_\omega$ in the usual sense. If $\omega$ is pure (see below for a discussion
of this desirable possibility), then obviously $\exp(ith_\omega) \in \pi_\omega(A)''$, since the latter
equals $B(H_\omega)$. A deep result states that this is always the case (Borchers Theorem):


Theorem 9.17. If $\omega$ is a ground state on $A$, then $\exp(ith_\omega) \in \pi_\omega(A)''$ for all $t \in \mathbb{R}$.


As we shall see, this contrasts with equilibrium states. The Heisenberg equation of
motion for operators $a(t)$ has a counterpart in the C*-algebraic formalism, which
requires a concept already encountered in §3.1, but repeated here for convenience:


Definition 9.18. A derivation on a C*-algebra $A$ is a linear map $\delta : A \to A$ with

$$\delta(ab) = \delta(a)b + a\delta(b) \quad (a,b \in A) \quad \text{(Leibniz rule)}. \tag{9.53}$$

An unbounded derivation is a linear map $\delta : \mathrm{Dom}(\delta) \to A$, where the domain
$\mathrm{Dom}(\delta) \subset A$ of $\delta$ is a dense linear subspace of $A$, that satisfies the Leibniz rule.
An (unbounded) derivation $\delta$ is symmetric when $\delta(a^*) = \delta(a)^*$ for all $a$ (in
$\mathrm{Dom}(\delta)$, which must be self-adjoint in that $a \in \mathrm{Dom}(\delta)$ iff $a^* \in \mathrm{Dom}(\delta)$).
Bounded derivations are rare in classical physics; nonzero derivations of $A = C_0(\mathbb{R}^d)$
do not even exist, but it has plenty of unbounded derivations, viz. $\delta(f) = \xi f$ for
some vector field $\xi$ on $\mathbb{R}^d$. In quantum mechanics, $A = B(H')$ does have derivations,
all given by $\delta(a) = i[h,a]$ for some bounded (self-adjoint) operator $h$ on $H'$.

Proposition 9.19. Any continuous homomorphism $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$ on any C*-algebra
$A$ defines an unbounded symmetric derivation $\delta$ on $A$ by the norm limit

$$\delta(a) = \frac{d}{dt}\alpha_t(a)_{|t=0} = \lim_{t\to 0}\frac{\alpha_t(a) - a}{t}, \tag{9.54}$$

where $\mathrm{Dom}(\delta)$ consists of all $a \in A$ for which this limit exists. Moreover, this domain
is stable under $\alpha_t$ in that if $a \in \mathrm{Dom}(\delta)$, then $\alpha_t(a) \in \mathrm{Dom}(\delta)$ ($t \in \mathbb{R}$).


The proof is an elementary verification (cf. Theorem 5.73). On $H_\omega$ we then have

$$\pi_\omega(\delta(a)) = i[h_\omega, \pi_\omega(a)], \tag{9.55}$$

which, then, is "Heisenberg's equation of motion revisited." One may also reformulate
Definition 9.16 in terms of the derivation $\delta$ associated to $\alpha$ by (9.54):


Proposition 9.20. A state $\omega \in S(A)$ is a ground state for given dynamics $\alpha$ iff

$$-i\omega(a^*\delta(a)) \geq 0 \quad (a \in \mathrm{Dom}(\delta)). \tag{9.56}$$


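Proof. If $\omega$ is a ground state according to Definition 9.16, we may use (9.55),
(C.196), (9.51), and finally (9.50) to compute

$$-i\omega(a^*\delta(a)) = -i\langle\Omega_\omega, \pi_\omega(a^*\delta(a))\Omega_\omega\rangle = \langle\Omega_\omega, \pi_\omega(a)^*[h_\omega, \pi_\omega(a)]\Omega_\omega\rangle = \langle\pi_\omega(a)\Omega_\omega, h_\omega\pi_\omega(a)\Omega_\omega\rangle \geq 0. \tag{9.57}$$

Conversely, we first show that if $\omega$ satisfies (9.56), then it is $\alpha$-invariant. We initially
assume $a = a^*$, so that $\delta(a)^* = \delta(a^*) = \delta(a)$, as $\delta$ is symmetric by construction.
Since $\omega$ is a state, one has $\omega(b^*) = \overline{\omega(b)}$ for any $b \in A$, so taking $b = \delta(a)a$, using
(9.56) just in that $\omega(a^*\delta(a)) \in i\mathbb{R}$, we obtain $\omega(\delta(a)a) = -\omega(a\delta(a))$. Hence

$$\omega(\delta(a^2)) = 0, \tag{9.58}$$

by (9.53), so also $\omega(\delta(\alpha_s(a)^2)) = 0$, $s \in \mathbb{R}$. With (9.54), we find

$$0 = \int_0^u ds\, \omega(\delta(\alpha_s(a)^2)) = \int_0^u ds\, \frac{d}{dt}\omega(\alpha_{t+s}(a)^2)_{|t=0} = \int_0^u ds\, \frac{d}{ds}\omega(\alpha_s(a)^2) = \omega(\alpha_u(a^2)) - \omega(a^2).$$

Hence $\omega(\alpha_u(a^2)) = \omega(a^2)$ for each $u > 0$ (and analogously for each $u < 0$), whenever
$a^* = a$, i.e., $\omega(\alpha_u(b)) = \omega(b)$ for each $b \geq 0$. But any $b \in A$ may be written as
a linear combination of at most four positive elements, so $\omega \circ \alpha_u = \omega$ for all $u \in \mathbb{R}$. We therefore
have a Hamiltonian $h_\omega$, whose positivity follows from (9.57), read backwards.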
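In finite dimensions the criterion (9.56) can be checked directly. The following sketch assumes $A = B(\mathbb{C}^6)$ with a randomly generated self-adjoint matrix `h` and the bounded derivation $\delta(a) = i[h,a]$; for a vector state given by an eigenvector $\psi$ of `h` with eigenvalue $E$, the quantity $-i\omega(a^*\delta(a))$ equals $\langle a\psi, (h - E)a\psi\rangle$. All names and the choice of dimension are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6
m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
h = (m + m.conj().T) / 2          # a random self-adjoint "Hamiltonian"
energies, vecs = np.linalg.eigh(h)


def delta(a):
    """Bounded derivation delta(a) = i[h, a], generating a -> e^{ith} a e^{-ith}."""
    return 1j * (h @ a - a @ h)


def criterion(psi, a):
    """-i * omega(a^* delta(a)) for the vector state omega = <psi, . psi>."""
    return -1j * np.vdot(psi, a.conj().T @ delta(a) @ psi)


ground, top = vecs[:, 0], vecs[:, -1]   # lowest and highest eigenvectors
samples = [rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
           for _ in range(200)]
ground_values = [criterion(ground, a) for a in samples]
top_values = [criterion(top, a) for a in samples]
```

For the lowest eigenvector every sampled value is real and non-negative, whereas for the highest eigenvector the values are non-positive, so that state violates (9.56).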
9.5 Ground states and equilibrium states of classical spin systems


Thermal equilibrium states are arguably physically more relevant than ground states, 
as the latter rely on the idealization of temperature zero. Since in statistical mechan- 
ics infinite systems are used to approximate very large ones, it will be of particular 
interest to define equilibrium states in infinite volume. If only to highlight contrasts 
with quantum theory, we take a long run and start with the classical case. 

Classical spin systems on a lattice are defined by a single-site configuration space
$\underline{n} = \{0,1,\ldots,n-1\}$, where $m \in \underline{n}$ may either be interpreted as some spin-like degree
of freedom (as in the Ising model, where $n = 2$) or as the number of (structureless)
particles occupying a given site (in which case one has a lattice gas). As in (C.310),
for any finite sublattice $\Lambda \subset \mathbb{Z}^d$, the local algebra of observables is given by

$$A^{(0)}_\Lambda = C(\underline{n}^\Lambda), \tag{9.59}$$

where $\underline{n}^\Lambda = C(\Lambda,\underline{n})$ consists of all functions $s : \Lambda \to \underline{n}$. For finite $\Lambda$ this is a finite
set (of cardinality $n^{|\Lambda|}$), so that all functions in question are continuous and hence
$C(\underline{n}^\Lambda)$ just stands for the commutative C*-algebra of all functions from $\underline{n}^\Lambda$ to $\mathbb{C}$. If
$\Lambda^{(1)} \subset \Lambda^{(2)}$, we have maps $A^{(0)}_{\Lambda^{(1)}} \hookrightarrow A^{(0)}_{\Lambda^{(2)}}$, written $f_1 \mapsto f_2$, which are given by

$$f_2(s) = f_1(s_{|\Lambda^{(1)}}), \tag{9.60}$$

where $s : \Lambda^{(2)} \to \underline{n}$. As these maps are injective, the ensuing inductive limit is simply

$$A^{(0)} = \overline{\cup_{\Lambda\subset\mathbb{Z}^d} A^{(0)}_\Lambda} \cong C(\underline{n}^{\mathbb{Z}^d}), \tag{9.61}$$

where $\underline{n}^{\mathbb{Z}^d} = \prod_{x\in\mathbb{Z}^d} \underline{n}$ is endowed with the product topology and hence (by Tychonoff's
theorem) is compact (for $n = 2$, $d = 1$ this is a model of the Cantor set).
As in the quantum case, local Hamiltonians are defined via an interaction $\Phi$,
which now is an assignment $X \mapsto \Phi(X)$, where $X \subset \mathbb{Z}^d$ is finite and $\Phi(X) \in A^{(0)}_X$.
If $X \subset Y$, we regard $\Phi(X)$ as an element in $A^{(0)}_Y$ through the inclusion $A^{(0)}_X \hookrightarrow A^{(0)}_Y$,
indicating this explicitly by writing $\Phi(X)_Y \in A^{(0)}_Y$. We then define $h_\Lambda \in A^{(0)}_\Lambda$ by

$$h_\Lambda = \sum_{X\subset\Lambda} \Phi(X)_\Lambda, \tag{9.62}$$

where the sum is over all subsets $X$ of $\Lambda$. For example, the Ising Hamiltonian

$$h_\Lambda(s) = -J\sum_{\langle ij\rangle\in\Lambda} s_i s_j - B\sum_{i\in\Lambda} s_i, \tag{9.63}$$

where the first sum is over nearest neighbours in $\Lambda$, and we assume $\underline{2} = \{-1,1\}$ (rather
than the usual c-bit $\{0,1\}$), comes from the following potential:

• $\Phi(X) = 0$ if either $|X| > 2$ or, if $|X| = 2$, its elements are not nearest neighbours;
• $\Phi(\{i\}) : s \mapsto -Bs_i$, and $\Phi(\{i,j\}) : s \mapsto -Js_is_j$ if $i$ and $j$ are nearest neighbours.




As in (9.41), the prescription (9.62) has free boundary conditions, in that it only
involves spins inside $\Lambda$. Another possibility is to fix a "boundary" spin configuration
$b \in \underline{n}^{\mathbb{Z}^d}$, and define $h^b_\Lambda \in A^{(0)}_\Lambda$ by

$$h^b_\Lambda = \sum_{X\subset\mathbb{Z}^d,\ |X|<\infty,\ X\cap\Lambda\neq\emptyset} \Phi(X)^b_\Lambda. \tag{9.64}$$

This involves some new notation $\Phi(X)^b_\Lambda$, which means the following. In principle,
$\Phi(X) \in A^{(0)}_X$ is a function on $\underline{n}^X$. We now turn $\Phi(X)$ into a function $\Phi(X)^b_\Lambda$ on $\underline{n}^\Lambda$
(so that $h^b_\Lambda$ is a function on $\underline{n}^\Lambda$ as required): for given $s : \Lambda \to \underline{n}$ and given $b : \mathbb{Z}^d \to \underline{n}$
we define $s' : X \to \underline{n}$ by putting $s' = s$ on $X \cap \Lambda$ and $s' = b$ on the remainder of $X$
(which is $X \cap \Lambda^c$, with $\Lambda^c = \mathbb{Z}^d\setminus\Lambda$). Then

$$\Phi(X)^b_\Lambda(s) = \Phi(X)(s'). \tag{9.65}$$

Physically, this simply means that those spins outside $\Lambda$ that interact with spins inside
$\Lambda$ are set at a fixed value determined by the boundary condition $b$. For example,
consider the Ising model in $d = 1$. If we take $\Lambda = \{2,3\}$, then from (9.62) we obtain
$h_\Lambda = -Js_2s_3 - B(s_2 + s_3)$; spins outside $\Lambda$ do not contribute. From (9.64), on the
other hand, we obtain $h^b_\Lambda = h_\Lambda - J(b_1s_2 + s_3b_4)$. Although the boundary condition
$b$ is arbitrary, one may think of simple choices like $b_i = 1$ or $-1$ for each $i$.
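The one-dimensional example can be verified by brute force. The sketch below is a minimal illustration, assuming spins are stored in dictionaries keyed by site; the function names `h_free` and `h_boundary` and the values chosen for $J$, $B$, $b_1$ and $b_4$ are hypothetical.

```python
from itertools import product


def h_free(s, J, B):
    """Free boundary conditions (9.62): s maps the sites of Lambda to spins +1/-1."""
    sites = sorted(s)
    bonds = sum(s[i] * s[i + 1] for i in sites if i + 1 in s)
    return -J * bonds - B * sum(s[i] for i in sites)


def h_boundary(s, b, J, B):
    """Fixed boundary condition b (9.64): keep every term whose support meets Lambda."""
    energy = -B * sum(s.values())              # single-site terms inside Lambda
    for i in s:
        for j in (i - 1, i + 1):               # nearest neighbours of i
            if j in s:
                if i < j:                      # count each internal bond once
                    energy -= J * s[i] * s[j]
            else:
                energy -= J * s[i] * b[j]      # bond to a frozen boundary spin
    return energy


J, B = 1.0, 0.3
b = {1: 1, 4: -1}                              # example boundary spins b_1, b_4
checks = []
for s2, s3 in product((-1, 1), repeat=2):
    s = {2: s2, 3: s3}
    lhs = h_boundary(s, b, J, B)
    rhs = h_free(s, J, B) - J * (b[1] * s2 + s3 * b[4])
    checks.append(abs(lhs - rhs) < 1e-12)
```

Every entry of `checks` is `True`, confirming $h^b_\Lambda = h_\Lambda - J(b_1s_2 + s_3b_4)$ on all four configurations of $\Lambda = \{2,3\}$.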

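We may actually rewrite (9.64) as a difference between Hamiltonians with free
boundary conditions. To do so, for given finite $\Lambda$ we pick some finite $\Lambda' \supset \Lambda$ large
enough that it contains all spins outside $\Lambda$ that interact with spins inside $\Lambda$ (provided
this is possible). With the conventional notation $h_\Lambda(s|b) \equiv h^b_\Lambda(s)$, this yields

$$h_\Lambda(s|b) = h_{\Lambda'}(s,b) - h_{\Lambda'\setminus\Lambda}(b) = \sum_{X'\subset\Lambda'} \Phi(X')_{\Lambda'}(s,b) - \sum_{Y\subset\Lambda'\setminus\Lambda} \Phi(Y)_{\Lambda'\setminus\Lambda}(b).$$

Analogous to (9.65), the notation $\Phi(X')_{\Lambda'}(s,b)$ here means $\Phi(X')_{\Lambda'}(s')$, for the
function $s' : \Lambda' \to \underline{n}$ that on $\Lambda \subset \Lambda'$ coincides with $s : \Lambda \to \underline{n}$, whilst on $(\Lambda'\setminus\Lambda) \subset \Lambda'$
it coincides with the restriction of $b$ to $\Lambda'\setminus\Lambda$. Thus we may also write

$$h_\Lambda(s|b) = \lim_{\Lambda'\nearrow\mathbb{Z}^d} \left(h_{\Lambda'}(s,b) - h_{\Lambda'\setminus\Lambda}(b)\right), \tag{9.66}$$

although neither $h_{\mathbb{Z}^d}(s,b)$ nor $h_{\mathbb{Z}^d\setminus\Lambda}(b)$ makes sense by itself. Periodic boundary
conditions for local Hamiltonians may be defined for arbitrary interactions $\Phi$ and
special lattices. For example, the Ising chain in $d = 1$ has local Hamiltonians

$$h_{\{1,\ldots,n\}}(s) = -J\left(s_ns_1 + \sum_{i=1}^{n-1} s_is_{i+1}\right) - B\sum_{i=1}^{n} s_i. \tag{9.67}$$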


Naively, a ground state of a finite classical spin system, i.e., a system of the
above kind defined on a fixed finite lattice $\Lambda \subset \mathbb{Z}^d$, is a spin configuration $s_0 \in \underline{n}^\Lambda$
that minimizes the local Hamiltonian $h_\Lambda$ (9.62), or its counterpart (9.64), that is,

$$h_\Lambda(s_0) \leq h_\Lambda(s), \tag{9.68}$$

for all $s \in \underline{n}^\Lambda$. For example, if $\Lambda$ is a hypercube $\Lambda_N$, then the Ising model (9.63)
has a unique ground state for $B > 0$, namely $s_0(x) = 1$ for all $x \in \Lambda$, whereas it
has two ground states $s_0^\pm$ for $B = 0$, given by $s_0^\pm(x) = \pm 1$ for all $x$. Ground states
of finite classical systems always exist (since the space on which $h_\Lambda$ is defined is finite), but
they are not necessarily unique; we just gave a counterexample! The same is true
for quantum theory, since for $B = 0$ also the quantum Ising model (9.42) has two
degenerate symmetry-breaking ground states. Nonetheless, this case is special, since
for nonzero small values of $B$ the ground state of the quantum Ising model is unique
for finite $\Lambda$, whereas on the infinite lattice $\mathbb{Z}^d$ it is degenerate (cf. §10.7).

The definition of ground states of infinite classical spin systems is just slightly
more involved: for local Hamiltonians $h_\Lambda$ with free boundary conditions defined by
an interaction $\Phi$ à la (9.62), a ground state is a point $s_0 \in \underline{n}^{\mathbb{Z}^d}$ for which

$$h_\Lambda(s_{0|\Lambda}) \leq h_\Lambda(s_{|\Lambda}), \tag{9.69}$$

for any finite $\Lambda \subset \mathbb{Z}^d$ and any spin configuration $s \in \underline{n}^{\mathbb{Z}^d}$. Alternatively, one may ask

$$h^b_\Lambda(s_0) \leq h^b_\Lambda(s), \tag{9.70}$$

for all finite $\Lambda \subset \mathbb{Z}^d$ and all spin configurations $s \in \underline{n}^{\mathbb{Z}^d}$ that coincide with $s_0$ outside
$\Lambda$, where $h^b_\Lambda$ stands for (9.64) with $b = s_0$. In other words, $s_0$ provides a boundary
condition $b$, which is fixed for all $s$ that compete with $s_0$ in minimizing the local
Hamiltonian $h^b_\Lambda$. Both definitions give the usual two ground states for the Ising
model with $B = 0$ (in which all spins are either "up" or "down"), but the second
one also opens the possibility of domain walls, where infinite chains of "spin up"
alternate with infinite chains of "spin down", and similarly in higher $d$.

If different ground states in the above ("pure") sense exist, we may reinterpret
such states $s_0$ as Dirac measures $\delta_{s_0}$ on the space $\underline{n}^\Lambda$ of all spin configurations on $\Lambda$,
and may also allow convex combinations of ground states as ground states. This, as
well as the analogy with Definition 9.16 (in which no purity condition is imposed),
inspires a more liberal definition of a ground state, which is predicated on Boltzmann's
idea that a state of a classical system of the kind we consider is a probability
measure $\mu_\Lambda$ on $\underline{n}^\Lambda$, and likewise for $\underline{n}^{\mathbb{Z}^d}$. In the C*-algebraic formalism we use, this
follows from (9.61) and the identification of states on $C(X)$ with completely regular
probability measures on $X$ (assumed to be a compact Hausdorff space, cf. §B.5). A
state $\mu$ on $C(\underline{n}^{\mathbb{Z}^d})$, i.e., a probability measure on $\underline{n}^{\mathbb{Z}^d}$, induces a state on each local
algebra $C(\underline{n}^\Lambda)$, i.e., a probability measure $\mu_\Lambda$ on $\underline{n}^\Lambda$, simply by restriction, since

$$C(\underline{n}^\Lambda) \subset C(\underline{n}^{\mathbb{Z}^d}) \tag{9.71}$$

through the injection (9.60), according to which $f_\Lambda \in C(\underline{n}^\Lambda)$ has image $f \in C(\underline{n}^{\mathbb{Z}^d})$
defined by $f(s) = f_\Lambda(s_{|\Lambda})$. The measure $\mu_\Lambda$, then, is given in terms of $\mu$ by
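$$\mu_\Lambda(f_\Lambda) = \mu(f); \tag{9.72}$$

the corresponding probability distribution $p_\Lambda$ (i.e., $p_\Lambda(s) = \mu_\Lambda(\{s\})$) is given by

$$p_\Lambda(s) = \mu(\{s' \in \underline{n}^{\mathbb{Z}^d} \mid s'_{|\Lambda} = s\}), \quad s \in \underline{n}^\Lambda. \tag{9.73}$$

The family of probability measures $(\mu_\Lambda)$ defined by $\mu$ is consistent in that if $\Lambda^{(1)} \subset \Lambda^{(2)}$
and $f_1 \in C(\underline{n}^{\Lambda^{(1)}})$ and $f_2 \in C(\underline{n}^{\Lambda^{(2)}})$ are related as in (9.60), then

$$\mu_{\Lambda^{(1)}}(f_1) = \mu_{\Lambda^{(2)}}(f_2). \tag{9.74}$$

Conversely, a consistent family of probability measures $(\mu_\Lambda)$ defines a unique probability
measure on $\underline{n}^{\mathbb{Z}^d}$ which induces the given family through (9.72).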
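The consistency condition (9.74) is easy to check on a toy model. In the sketch below a four-site lattice `BIG` stands in for the infinite lattice (an assumption made purely so that everything is finite), `mu` is a randomly generated distribution on its configurations, and `marginal` implements (9.73); all names are hypothetical.

```python
from itertools import product
import random

random.seed(1)
SPINS = (0, 1)           # single-site space n = {0, 1}
BIG = (0, 1, 2, 3)       # finite stand-in for the infinite lattice

# A random probability distribution on the configurations of BIG.
configs = list(product(SPINS, repeat=len(BIG)))
weights = [random.random() for _ in configs]
total = sum(weights)
mu = {c: w / total for c, w in zip(configs, weights)}


def marginal(mu, sites, sub):
    """p_sub(s) = mu({s' : s' restricted to sub equals s}), cf. (9.73)."""
    idx = [sites.index(x) for x in sub]
    p = {}
    for c, w in mu.items():
        key = tuple(c[i] for i in idx)
        p[key] = p.get(key, 0.0) + w
    return p


def expect(p, f):
    """Expectation value of the function f in the distribution p."""
    return sum(w * f(s) for s, w in p.items())


L1, L2 = (1,), (1, 2)                  # two nested sublattices
p1, p2 = marginal(mu, BIG, L1), marginal(mu, BIG, L2)
f1 = lambda s: 3.0 * s[0] - 1.0        # a function on configurations of L1
f2 = lambda s: f1((s[0],))             # its image under (9.60): ignores site 2
```

Here `expect(p1, f1)` and `expect(p2, f2)` agree up to rounding, and marginalizing `p2` further down to `L1` reproduces `p1`.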


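Definition 9.21. For given finite $\Lambda \subset \mathbb{Z}^d$, a probability measure $\mu^0_\Lambda$ on $\underline{n}^\Lambda$ is a
ground state of a local Hamiltonian $h_\Lambda$ (with free boundary conditions) if, in terms
of the probabilities $p^0_\Lambda(s) = \mu^0_\Lambda(\{s\})$, for any probability measure $\mu_\Lambda$ on $\underline{n}^\Lambda$,

$$\sum_{s\in\underline{n}^\Lambda} p^0_\Lambda(s)h_\Lambda(s) \leq \sum_{s\in\underline{n}^\Lambda} p_\Lambda(s)h_\Lambda(s). \tag{9.75}$$

A probability measure $\mu^0$ on $\underline{n}^{\mathbb{Z}^d}$ is a ground state for some interaction $\Phi$ if (9.75)
holds for any probability measure $\mu$ on $\underline{n}^{\mathbb{Z}^d}$ and any finite subset $\Lambda \subset \mathbb{Z}^d$, where this
time $p^0_\Lambda$ (and analogously $p_\Lambda$) is defined by (9.73).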


In particular, convex sums of pure ground states are ground states in this more general
sense, so that, if all pure ground states break some symmetry (as is the case
for the $\mathbb{Z}_2$-symmetry $s \mapsto -s$ of the Ising model at $B = 0$), symmetric convex sums
will restore the symmetry. The set of all ground states of a given interaction $\Phi$ is a
convex set, whose extreme points are the pure ground states (at least, under suitable
hypotheses on $\Phi$). This leads to a discussion of SSB similar to the quantum case.


In the following discussion of equilibrium states, we use the notation

$$\mathrm{Pr}(X) = S(C(X)) \tag{9.76}$$

for the compact convex set of all completely regular probability measures on $X$,
which as above will either be the finite set $\underline{n}^\Lambda$ (with discrete topology), on which of
course any probability measure is completely regular, or the compact space $\underline{n}^{\mathbb{Z}^d}$. In
the first case we may as well use probability distributions $p_\Lambda$ (instead of probability
measures) on $\underline{n}^\Lambda$. In the second, we could also use Baire measures.

Given an interaction $\Phi$ and the ensuing family (9.62) of local Hamiltonians $h_\Lambda$,
we define the local energy for each finite $\Lambda \subset \mathbb{Z}^d$ as a function $E_\Lambda : \mathrm{Pr}(\underline{n}^\Lambda) \to \mathbb{R}$ by

$$E_\Lambda(p_\Lambda) = \sum_{s\in\underline{n}^\Lambda} p_\Lambda(s)h_\Lambda(s). \tag{9.77}$$
Of course, this is just the expectation value of the Hamiltonian in the state $p_\Lambda$. The
local entropy $S_\Lambda : \mathrm{Pr}(\underline{n}^\Lambda) \to \mathbb{R}$ is a more subtle concept; rather than the expectation
value of some (local) observable, it specifies a property of the probability distribution
itself. With Boltzmann's constant $k_B$, we have

$$S_\Lambda(p_\Lambda) = -k_B\sum_{s\in\underline{n}^\Lambda} p_\Lambda(s)\ln(p_\Lambda(s)). \tag{9.78}$$

Note that $S_\Lambda(p_\Lambda) \geq 0$, with equality iff $p_\Lambda$ is a pure state (i.e., $p_\Lambda$ is supported at a
single spin configuration). The local free energy $F^\beta_\Lambda : \mathrm{Pr}(\underline{n}^\Lambda) \to \mathbb{R}$ is defined as

$$F^\beta_\Lambda = E_\Lambda - TS_\Lambda, \tag{9.79}$$

where $\beta = 1/k_BT$. A local equilibrium state, then, is a probability distribution $p^\beta_\Lambda$
that minimizes the free energy (for fixed temperature $T$).


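Theorem 9.22. For each $T > 0$, there is a unique local equilibrium state, given by
the Boltzmann distribution (and associated partition function)

$$p^\beta_\Lambda(s) = (Z^\beta_\Lambda)^{-1} e^{-\beta h_\Lambda(s)}; \tag{9.80}$$
$$Z^\beta_\Lambda = \sum_{s'\in\underline{n}^\Lambda} e^{-\beta h_\Lambda(s')}. \tag{9.81}$$

The associated free energy in equilibrium is then given by

$$F^\beta_\Lambda \equiv F^\beta_\Lambda(p^\beta_\Lambda) = -\beta^{-1}\ln Z^\beta_\Lambda. \tag{9.82}$$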
Proof. The claim follows from the fact that any $p_\Lambda \in \mathrm{Pr}(\underline{n}^\Lambda)$ satisfies the inequality

$$F^\beta_\Lambda(p_\Lambda) \geq -\beta^{-1}\ln Z^\beta_\Lambda, \tag{9.83}$$

with equality iff $p_\Lambda = p^\beta_\Lambda$, i.e., using (9.79), (9.77), and (9.78), we need to show that

$$\sum_{s\in\underline{n}^\Lambda} p(s)\left(h_\Lambda(s) + \beta^{-1}\ln p(s)\right) + \beta^{-1}\ln Z^\beta_\Lambda \geq 0. \tag{9.84}$$

Using (9.80), for each $s \in \underline{n}^\Lambda$ we obtain

$$-\beta h_\Lambda(s) = \ln Z^\beta_\Lambda + \ln p^\beta_\Lambda(s). \tag{9.85}$$

Substituting this in (9.84), using $\sum_s p(s) = 1$, omitting the ensuing prefactor $\beta^{-1}$,
and noting that $p^\beta_\Lambda(s) > 0$ for all $s$, the inequality (9.84) to be proved becomes

$$\sum_{s\in\underline{n}^\Lambda} p(s)\ln\left(\frac{p(s)}{p^\beta_\Lambda(s)}\right) \geq 0. \tag{9.86}$$

Hence we need to prove the inequality

$$\sum_{s\in\underline{n}^\Lambda} p^\beta_\Lambda(s)\,\frac{p(s)}{p^\beta_\Lambda(s)}\ln\left(\frac{p(s)}{p^\beta_\Lambda(s)}\right) \geq 0, \tag{9.87}$$

with equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$. Let us note that the function $f(x) = x\ln x$ is
strictly convex for all $x > 0$, that is, for any finite set of numbers $p'(s) \in (0,1)$ with
$\sum_s p'(s) = 1$ and any set of positive real numbers $(x_s)_s$, we have

$$\sum_s p'(s)f(x_s) \geq f\left(\sum_s p'(s)x_s\right), \tag{9.88}$$

with equality iff all numbers $x_s$ are the same. Applying this with $p'(s) = p^\beta_\Lambda(s)$ and
$x_s = p(s)/p^\beta_\Lambda(s)$, so that $p'(s)x_s = p(s)$ and hence $\sum_s p'(s)x_s = \sum_s p(s) = 1$, which
makes the right-hand side of (9.88) vanish since $\ln(1) = 0$, finally leads to (9.87).
Equality arises iff $p(s)/p^\beta_\Lambda(s)$ equals the same number $c$ for all $s$; summing over all $s$
forces $c = 1$, so that one has equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$, as desired.
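The variational statement of Theorem 9.22 can be tested on a very small system. The sketch below assumes a three-site open Ising chain with illustrative values of $J$, $B$ and $\beta$, sets $k_BT = 1/\beta$, and compares the free energy of the Boltzmann distribution with that of randomly drawn distributions; the helper names are hypothetical.

```python
import math
import random
from itertools import product

random.seed(0)
J, B, beta, n_sites = 1.0, 0.2, 0.7, 3   # illustrative parameter values


def h(s):
    """Ising energy (9.63) on a 3-site open chain, spins +1/-1."""
    return -J * sum(s[i] * s[i + 1] for i in range(n_sites - 1)) - B * sum(s)


configs = list(product((-1, 1), repeat=n_sites))
Z = sum(math.exp(-beta * h(s)) for s in configs)            # (9.81)
p_beta = {s: math.exp(-beta * h(s)) / Z for s in configs}   # (9.80)


def free_energy(p):
    """F = E - T*S with k_B*T = 1/beta, using the convention 0*ln(0) = 0."""
    energy = sum(p[s] * h(s) for s in configs)
    entropy_over_kB = -sum(q * math.log(q) for q in p.values() if q > 0)
    return energy - entropy_over_kB / beta


def random_distribution():
    """A randomly drawn probability distribution on the configurations."""
    w = [random.random() for _ in configs]
    t = sum(w)
    return {s: x / t for s, x in zip(configs, w)}


F_eq = free_energy(p_beta)
F_others = [free_energy(random_distribution()) for _ in range(1000)]
```

The value `F_eq` coincides with $-\beta^{-1}\ln Z$ as in (9.82), and none of the sampled distributions attains a lower free energy, in line with (9.83).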


Neither the local Hamiltonians (9.62) nor the local partition functions (9.81) have
a limit as $\Lambda \nearrow \mathbb{Z}^d$. A precise definition of equilibrium states of infinite classical systems
was given in 1968 by Dobrushin and by Lanford and Ruelle (DLR).


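Definition 9.23. For fixed inverse temperature $\beta \in (0,\infty)$ and fixed interaction $\Phi$,
a Gibbs measure $\mu^\beta$ is a (Baire = regular Borel) probability measure on $\underline{n}^{\mathbb{Z}^d}$ such
that for each finite $\Lambda \subset \mathbb{Z}^d$ and each pair $(s,b)$ of a spin configuration $s : \Lambda \to \underline{n}$ plus
boundary condition $b : \Lambda^c \to \underline{n}$, the conditional probability $\mu^\beta(s|b)$ for the events

$$s = \{s' \in \underline{n}^{\mathbb{Z}^d} \mid s'_{|\Lambda} = s\} \subset \underline{n}^{\mathbb{Z}^d}; \tag{9.89}$$
$$b = \{s'' \in \underline{n}^{\mathbb{Z}^d} \mid s''_{|\Lambda^c} = b\} \subset \underline{n}^{\mathbb{Z}^d}, \tag{9.90}$$

is given in terms of the local Hamiltonian $h_\Lambda(s|b)$ as defined by (9.66) by

$$\mu^\beta(s|b) = (Z^\beta_\Lambda(b))^{-1} e^{-\beta h_\Lambda(s|b)}; \tag{9.91}$$
$$Z^\beta_\Lambda(b) = \sum_{s'\in\underline{n}^\Lambda} e^{-\beta h_\Lambda(s'|b)}. \tag{9.92}$$

Recall that $\mu^\beta(s|b) = \mu^\beta(s\cap b)/\mu^\beta(b)$, where $s\cap b = \{s_b\}$ consists of the single
spin configuration $s_b : \mathbb{Z}^d \to \underline{n}$ that coincides with $s$ on $\Lambda$ and coincides with $b$ on
$\Lambda^c$. Thus we may write $\mu^\beta(s|b) = p^\beta(s_b)/\mu^\beta(b)$, where $p^\beta(s) = \mu^\beta(\{s\})$ as usual.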

It was initially unclear how to generalize this highly fruitful definition of equilib- 
rium states in classical statistical mechanics to the quantum case, where conditional 
probabilities are not well defined (this was eventually resolved, however, through 
Definition 10.9 below). Thus a different (equally fruitful) approach to equilibrium 
states of (infinite) quantum systems was developed, to which we now turn. 
9.6 Equilibrium (KMS) states of quantum systems


For finite quantum spin systems we have expressions for the energy $E_\Lambda$, the entropy
$S_\Lambda$, and the free energy $F^\beta_\Lambda$ that are analogous to their classical counterparts
(9.77), (9.78), and (9.79). In particular, these quantities are functions on the state
space $S(A_\Lambda)$. Since $A_\Lambda = B(H_\Lambda)$, where we assume that $H$ and hence $H_\Lambda$ is finite-dimensional,
each state $\omega_\Lambda \in S(A_\Lambda)$ is given by a density operator $\rho_\Lambda$, so that

$$E_\Lambda(\omega_\Lambda) = \omega_\Lambda(h_\Lambda) = \mathrm{Tr}(\rho_\Lambda h_\Lambda); \tag{9.93}$$
$$S_\Lambda(\omega_\Lambda) = -k_B\mathrm{Tr}(\rho_\Lambda\ln\rho_\Lambda); \tag{9.94}$$
$$F^\beta_\Lambda = E_\Lambda - TS_\Lambda. \tag{9.95}$$


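Defining a local equilibrium state as a density matrix $\rho_\Lambda$ that minimizes the free
energy (for fixed $T$), we have the following quantum analogue of Theorem 9.22:


Theorem 9.24. For each $T > 0$, there is a unique local equilibrium state $\omega^\beta_\Lambda$, viz.

$$\omega^\beta_\Lambda(a) = \mathrm{Tr}(\rho^\beta_\Lambda a); \tag{9.96}$$
$$\rho^\beta_\Lambda = (Z^\beta_\Lambda)^{-1} e^{-\beta h_\Lambda}; \tag{9.97}$$
$$Z^\beta_\Lambda = \mathrm{Tr}(e^{-\beta h_\Lambda}); \tag{9.98}$$
$$F^\beta_\Lambda \equiv F^\beta_\Lambda(\rho^\beta_\Lambda) = -\beta^{-1}\ln Z^\beta_\Lambda. \tag{9.99}$$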
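Proof. One proof is analogous to the classical case, in that for all density operators $\rho_\Lambda$ on $H_\Lambda$,

$$F^\beta_\Lambda(\rho_\Lambda) \geq -\beta^{-1}\ln Z^\beta_\Lambda, \tag{9.100}$$

with equality iff $\rho_\Lambda = \rho^\beta_\Lambda$. This, in turn, follows from the inequality

$$\mathrm{Tr}(a(\ln b - \ln a)) \leq \mathrm{Tr}(b - a), \tag{9.101}$$

with equality iff $b = a$, which is valid for matrices $a,b$ for which $a \geq 0$ (in the usual
sense that $\lambda \geq 0$ for each $\lambda \in \sigma(a)$) and $b > 0$ in that $\lambda > 0$ for each $\lambda \in \sigma(b)$. The
case $a = \rho_\Lambda$ and $b = \rho^\beta_\Lambda$ immediately gives the claim.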


What remains to be done, however, is to define equilibrium states for infinite systems.
This is achieved through the so-called KMS-condition, which is based on the
observation that for any $a,b \in A_\Lambda$, in terms of (9.40) the state (9.96) satisfies

$$\omega^\beta_\Lambda(\alpha^{(\Lambda)}_t(a)b) = \omega^\beta_\Lambda(b\alpha^{(\Lambda)}_{t+i\beta}(a)) \quad (t \in \mathbb{R}). \tag{9.102}$$

Moreover, in finite systems this condition (even at $t = 0$) fully characterizes $\omega^\beta_\Lambda$:


Proposition 9.25. Let $h$ be a self-adjoint operator on a finite-dimensional Hilbert
space $H'$, with associated density operator $\rho$ and (complex) time-evolution given by

$$\rho = \frac{e^{-h}}{\mathrm{Tr}(e^{-h})}; \tag{9.103}$$
$$\alpha_z(a) = e^{izh} a e^{-izh}, \quad z \in \mathbb{C},\ a \in B(H'), \tag{9.104}$$

respectively (the exponentials being defined by a norm-convergent power series).
Then the associated two-point functions defined by $\omega(a) = \mathrm{Tr}(\rho a)$ satisfy

$$\omega(ab) = \omega(b\alpha_i(a)) \quad (a,b \in B(H')). \tag{9.105}$$

Conversely, any state for which (9.105) holds for given $h$ and $\alpha_z$ is given by (9.103).


Proof. Eq. (9.105) follows from (9.103) - (9.104) and cyclicity of the trace, i.e.,
(A.78). Similarly, given non-degeneracy of the Hilbert-Schmidt inner product (B.495)
on $B(H')$, eq. (9.105) is equivalent to the condition

$$\rho a = e^{-h} a e^{h}\rho, \tag{9.106}$$

for each $a \in B(H')$. Multiplying with $\exp(h)$ shows that $\exp(h)\rho$ commutes with
every $a \in B(H')$. Since $B(H')' = \mathbb{C}\cdot 1$, we obtain $\exp(h)\rho = \lambda\cdot 1$. Since $\exp(h)$
is invertible with inverse $\exp(-h)$, we obtain $\rho = \lambda\cdot\exp(-h)$, upon which the normalization
condition $\mathrm{Tr}(\rho) = 1$ yields (9.103).
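The finite-dimensional KMS relation can be confirmed numerically. The sketch below assumes a randomly generated self-adjoint $4\times 4$ matrix `h` and illustrative values of $\beta$ and $t$; it checks the relation in the form (9.102) with a general $\beta$, which reduces to (9.105) for $\beta = 1$ and $t = 0$. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, beta, t = 4, 0.8, 0.37   # illustrative values


def random_matrix():
    """A random complex matrix, used for h, a and b."""
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))


m = random_matrix()
h = (m + m.conj().T) / 2      # self-adjoint "Hamiltonian"
lam, v = np.linalg.eigh(h)


def exp_h(z):
    """e^{z h} for complex z, from the spectral decomposition of h."""
    return (v * np.exp(z * lam)) @ v.conj().T


def alpha(z, a):
    """Complex-time evolution alpha_z(a) = e^{izh} a e^{-izh}, cf. (9.104)."""
    return exp_h(1j * z) @ a @ exp_h(-1j * z)


rho = exp_h(-beta) / np.trace(exp_h(-beta)).real   # Gibbs density matrix


def omega(a):
    """The state omega(a) = Tr(rho a)."""
    return np.trace(rho @ a)


a, b = random_matrix(), random_matrix()
lhs = omega(alpha(t, a) @ b)
rhs = omega(b @ alpha(t + 1j * beta, a))
```

The two numbers `lhs` and `rhs` agree up to rounding error, as cyclicity of the trace predicts.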


For arbitrary C*-algebras $A$ with time-evolution $t \mapsto \alpha_t$, expressions like $\alpha_{i\beta}(a)$
may not be defined, so one has to proceed more carefully, but the idea is the same.


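Definition 9.26. Let $A$ be a C*-algebra with a one-parameter automorphism group $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$. A KMS
state at "inverse temperature" $\beta \in \mathbb{R}$ is a state $\omega$ on $A$ with the following property:


1. For any $a,b \in A$, the function $F_{a,b} : t \mapsto \omega(b\alpha_t(a))$ from $\mathbb{R}$ to $\mathbb{C}$ has an analytic
continuation to the strip

$$S_\beta = \{z \in \mathbb{C} \mid 0 \leq \mathrm{Im}(z) \leq \beta\}, \tag{9.107}$$

where it is holomorphic in the interior and continuous on the boundary

$$\partial S_\beta = \mathbb{R} \cup (\mathbb{R} + i\beta). \tag{9.108}$$

2. The boundary values of $F_{a,b}$ are related, for all $t \in \mathbb{R}$, by

$$F_{a,b}(t) = \omega(b\alpha_t(a)); \tag{9.109}$$
$$F_{a,b}(t + i\beta) = \omega(\alpha_t(a)b). \tag{9.110}$$

If this is the case, $\omega$ satisfies the KMS-condition at (inverse temperature) $\beta$.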
It is easy to show that $A$ has a dense subset $A_\alpha$, such that for any $a \in A_\alpha$ the function
$t \mapsto \alpha_t(a)$ from $\mathbb{R}$ to $A$ extends to an entire $A$-valued analytic function, written $z \mapsto
\alpha_z(a)$ (i.e., for each $\varphi \in A^*$ the function $z \mapsto \varphi(\alpha_z(a))$ from $\mathbb{C}$ to $\mathbb{C}$ is entire analytic).
Namely, for any $a \in A$ and $\varepsilon > 0$, define


$$a_\varepsilon = \int_{-\infty}^{\infty} \frac{dt}{\varepsilon\sqrt{\pi}}\, e^{-t^2/\varepsilon^2}\alpha_t(a), \tag{9.111}$$


which satisfies $a_\varepsilon \in A_\alpha$ and $\lim_{\varepsilon\downarrow 0} a_\varepsilon = a$. If $A = B(H')$ with $\dim(H') < \infty$, we even
have $B(H')_\alpha = B(H')$, since (9.104) is entire analytic in $z$ for any $a \in B(H')$. For
any $A$, the KMS-condition on $\omega$ is then equivalent to the simpler requirement

$$\omega(ab) = \omega(b\alpha_{i\beta}(a)) \quad (a \in A_\alpha,\ b \in A). \tag{9.112}$$

Corollary 9.27. If $A = B(H')$ with $\dim(H') < \infty$, then KMS states (at fixed $\beta$) are
necessarily given by the equilibrium states of Theorem 9.24 and hence are unique.


Although initially the characterization of equilibrium states of infinite systems by 
the KMS condition was tentative, in the 1970s and ’80s it became clear that it was 
spot on, being equivalent to local and global thermodynamic stability (against per- 
turbations of the dynamics), the (local) maximum entropy principle, etc. Also: 


Proposition 9.28. A KMS state at $\beta \in \mathbb{R}\setminus\{0\}$ is time-independent.


Proof. We just sketch the proof if $A$ is unital. Taking $b = 1_A$, for fixed $a \in A_\alpha$ the
function $F_{a,1_A} \equiv F$ defined by $F(z) = \omega(\alpha_z(a))$ is entire analytic on $\mathbb{C}$. Writing
$z = t + is$ (with $s,t \in \mathbb{R}$), we have $\alpha_z = \alpha_t \circ \alpha_{is}$ and hence (since each $\alpha_t$ is an
automorphism and hence an isometry), $|F(t + is)| \leq \|\alpha_{is}(a)\|$. Also, (9.112) yields
$F(t + i(s + \beta)) = F(t + is)$. Hence $F(t + is)$ is bounded in $t$ and periodic in $s$; by the
latter property its supremum on $\mathbb{C}$ may be computed by its supremum on the strip
$S_\beta$, and by the former property this supremum is finite. Therefore, $F$ is bounded,
and so by Liouville's Theorem it must be constant, in particular for $z = t \in \mathbb{R}$. Hence
$\alpha_t^*\omega(a) = \omega(a)$ for each $a \in A_\alpha$, and since this is a dense set, $\alpha_t^*\omega = \omega$.


By the argument for ground states following Definition 9.16, the automorphism
group $t \mapsto \alpha_t$ is unitarily implemented in the GNS-representation $\pi_\omega$ induced by
a KMS state $\omega$, such that (9.51) - (9.52) hold. However, the operator $h_\omega$ in this construction
should not be confused with the Hamiltonian of the system. For example,
suppose $A = B(H')$ for some (not necessarily finite-dimensional) Hilbert space $H'$,
so that (9.39) holds for some (not necessarily bounded) Hamiltonian $h$ with discrete
spectrum, such that $\exp(-\beta h) \in B_1(H')$. If we now define the density operator

$$\rho = \frac{e^{-\beta h}}{\mathrm{Tr}(e^{-\beta h})}, \tag{9.113}$$

then the corresponding state $\omega$ satisfies the KMS-condition at $\beta$. Generalizing the
computations around (2.66) in §2.4, we then find (up to unitary equivalence):
$$H_\omega = B_2(H'); \tag{9.114}$$
$$\pi_\omega(a)b = ab; \tag{9.115}$$
$$\Omega_\omega = \rho^{1/2}; \tag{9.116}$$
$$e^{ith_\omega} = \pi_\omega(e^{ith})\pi'_\omega(e^{-ith}), \tag{9.117}$$

where for any $a \in B(H')$, the operator $\pi'_\omega(a)$ on $B_2(H')$ is defined by

$$\pi'_\omega(a)b = ba. \tag{9.118}$$


Note that (9.115) is well defined, since p > 0 and p € B,(H’), whence p!/2 E 
Bo(H'), and hence also ab € B2(H') and ba € B2(H"), since B2(H’) is a two-sided 
ideal in B(H’). If h happens to be bounded, we may therefore write 


ho = To(h) — 1,(h). (9.119) 


Note that the $\pi'_\omega$ term in (9.117) is not needed for (9.52), since $[\pi_\omega(a), \pi'_\omega(b)] = 0$
for any $a,b \in B(H')$, but it is necessary to secure (9.51). Another feature of this
example is that the vector $\Omega_\omega$ is not only cyclic for $\pi_\omega(B(H'))$, which it has to be
by virtue of the GNS-construction, but also separating, i.e., $\pi_\omega(a)\Omega_\omega = 0$ implies
$\pi_\omega(a) = 0$. In other words, one has $\omega(a^*a) = 0$ iff $a = 0$ (which is by no means the
case for ground states). If $\dim(H') < \infty$, this is obvious, because $\pi_\omega(a)\Omega_\omega = a\rho^{1/2}$
and $\rho^{1/2}$ is invertible. In general, for arbitrary C*-algebras $A$ we have:


Proposition 9.29. Let $\omega$ be a KMS state on $A$ at $\beta \in \mathbb{R}$. Then $\Omega_\omega$ is both cyclic and
separating for $\pi_\omega(A)$ and hence also for $\pi_\omega(A)''$ (as well as for $\pi_\omega(A)'$).


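Proof. Since $\omega(a^*a) = \|\pi_\omega(a)\Omega_\omega\|^2$, we have $\omega(a^*a) = 0$ iff $\pi_\omega(a)\Omega_\omega = 0$, so that

$$\omega(a^*\alpha_t(a)) = \langle \pi_\omega(a)\Omega_\omega, \pi_\omega(\alpha_t(a))\Omega_\omega\rangle = 0 \qquad (t \in \mathbb{R})$$

if $\omega(a^*a) = 0$, and hence $F_{a^*,a}(t) = 0$, cf. (9.109). The “edge of the wedge” theorem
then gives $F_{a^*,a}(z) = 0$ for all $z \in \mathcal{S}_\beta$, upon which the KMS-condition gives

$$\omega(aa^*) = F_{a^*,a}(i\beta) = 0.$$

This means that $\omega(a^*a) = 0$ iff $\omega(aa^*) = 0$, or $\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a)^*\Omega_\omega = 0$, and
hence $\pi_\omega(b^*)\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a^*)\pi_\omega(b)\Omega_\omega = 0$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(A)$,
the assumption $\pi_\omega(a)\Omega_\omega = 0$ therefore implies that the bounded operator $\pi_\omega(a^*)$
vanishes on a dense domain in $H_\omega$ and hence vanishes. Since $\pi_\omega(a) = (\pi_\omega(a^*))^*$, it
follows that $\pi_\omega(a) = 0$. The extension to $\pi_\omega(A)''$ (and $\pi_\omega(A)'$) is obvious.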


Corollary 9.30. If $\omega$ is a KMS state on a quasi-local algebra $A$, i.e., given by (8.130)
with $\dim(H) < \infty$, then $\omega(a^*a) = 0$ iff $a = 0$, and hence the GNS-representation
$\pi_\omega : A \to B(H_\omega)$ is injective.


Proof. By the previous proof, the closed left-ideal (C.204) is actually a two-sided
ideal, which must be zero, since $A$ is simple (as is easily shown from the simplicity
of $B(H)$ for finite-dimensional $H$, cf. §8.5).

Proposition 9.29 shows that the von Neumann algebra $\pi_\omega(A)''$ is in standard
form (see Definition C.158), so that the KMS condition brings us into the realm of
the Tomita–Takesaki theory. In particular, Theorem C.159 provides us with another
time-evolution, namely the one given by the modular group. In the situation of
Theorem C.159, we take $a \in M_\alpha$ and $b \in M$, and compute

$$\langle\Omega, b\,\alpha_{-i}(a)\Omega\rangle = \langle\Omega, b\Delta a\Delta^{-1}\Omega\rangle = \langle\Omega, b\Delta a\Omega\rangle$$
$$= \langle\Delta^{1/2}b^*\Omega, \Delta^{1/2}a\Omega\rangle = \langle J\Delta^{1/2}a\Omega, J\Delta^{1/2}b^*\Omega\rangle$$
$$= \langle Sa\Omega, Sb^*\Omega\rangle = \langle a^*\Omega, b\Omega\rangle \qquad (9.120)$$
$$= \langle\Omega, ab\Omega\rangle, \qquad (9.121)$$

where we used the property $\Delta^{1/2}\Omega = \Omega$ as well as anti-unitarity of $J$, which
implies $\langle J\psi, J\varphi\rangle = \langle\varphi, \psi\rangle$; these facts follow from the definitions of $\Delta$ and $J$ via $S$.
Therefore, the state $\omega$ on $M$ defined by $\omega(a) = \langle\Omega, a\Omega\rangle$ ($a \in M$) satisfies the
KMS-condition for the modular group at $\beta = -1$. If, on the other hand, we start with a
$\beta$-KMS state $\omega$ on a C*-algebra $A$ with respect to some given time-evolution $\alpha$, and
take $H = H_\omega$, $M = \pi_\omega(A)''$, and $\Omega = \Omega_\omega$, the normal extension of $\omega$ to $\pi_\omega(A)''$ given
by $\langle\Omega_\omega, \cdot\,\Omega_\omega\rangle$ still satisfies the KMS condition with respect to the time-evolution on
$\pi_\omega(A)''$ given by conjugation with $\exp(ith_\omega)$, as in (9.52). Comparing the latter with
the time-evolution on $M$ defined by conjugation with $\Delta^{it}$ (cf. Theorem C.159) gives


$$e^{ith_\omega} = \Delta^{-it/\beta}, \qquad (9.122)$$

since both one-parameter groups of unitary operators satisfy the KMS-condition at
$\beta$, and a time-evolution $\alpha_t$ that satisfies the KMS-condition relative to a given
state $\omega$ and inverse temperature $\beta$ is unique. To see this (barring technicalities about
unbounded operators that are easily dealt with), take $\beta = -1$ for simplicity, assume
$\alpha_t$ is conjugation by $\Delta^{it} = \exp(ith)$ (i.e., $\Delta = \exp(h)$), and rewrite (9.112) as

$$\omega(ab) = \langle b^*\Omega, \Delta a\Omega\rangle. \qquad (9.123)$$

This determines $\langle\varphi, \Delta\psi\rangle$ between a dense set of vectors $\varphi, \psi$, and hence fixes $\Delta$.
The operators $J$ and $\Delta$ from the Tomita–Takesaki theory can explicitly be
computed in the example (9.113); the antilinear operator $J : B_2(H') \to B_2(H')$ reads

$$Jb = b^*, \qquad (9.124)$$

so that the (anti-linear) isomorphism $a \mapsto JaJ$ between $\pi_\omega(A)'' \cong B(H')$ (where $B(H')$ acts on
$B_2(H')$ by left multiplication) and its commutant $\pi_\omega(A)' \cong B(H')$ (which copy of
$B(H')$ now acts on $B_2(H')$ by right multiplication) is given by $JaJb = ba^*$. Furthermore,
the (generally unbounded) linear operator $\Delta : B_2(H') \to B_2(H')$ is given by

$$\Delta b = \rho b\rho^{-1}, \qquad (9.125)$$

which strictly speaking is defined as the closure of the expression (9.125) on the
domain of all $b \in B_2(H')$ for which $b\rho^{-1/2} \in B(H')$.
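
For finite-dimensional $H'$ these formulas can be verified directly. The sketch below is an illustration under stated assumptions rather than part of the text: it takes a randomly generated faithful density matrix (diagonal in a chosen basis), uses $\Omega = \rho^{1/2}$, and checks the defining relation $S a\Omega = a^*\Omega$ with $S = J\Delta^{1/2}$, where $\Delta^{1/2}b = \rho^{1/2}b\rho^{-1/2}$ is obtained from (9.125) by functional calculus. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
p = rng.random(n) + 0.1
p /= p.sum()                      # eigenvalues of a faithful density matrix
rho = np.diag(p)                  # rho taken diagonal (a choice of basis)
Omega = np.diag(np.sqrt(p))       # cyclic and separating vector rho^{1/2}


def random_matrix():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


def J(b):
    """Modular conjugation: J b = b^*."""
    return b.conj().T


def Delta(b, power=1.0):
    """Delta^power b = rho^power b rho^{-power}."""
    return np.diag(p**power) @ b @ np.diag(p**(-power))


def S(b):
    """Tomita operator S = J Delta^{1/2}."""
    return J(Delta(b, 0.5))


a, b, c = random_matrix(), random_matrix(), random_matrix()

S_on_a_Omega = S(a @ Omega)             # should equal a^* Omega
JaJ_on_b = J(a @ J(b))                  # should equal b a^* (right multiplication)
commutes = np.allclose(J(a @ J(c @ b)), c @ J(a @ J(b)))  # JaJ commutes with left mult.
```

The last check illustrates why $JaJ$ lies in the commutant: right multiplication by $a^*$ commutes with left multiplication by any $c$.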

Theorem 9.31. For given unital C*-algebra $A$, dynamics $\alpha: \mathbb{R} \to \mathrm{Aut}(A)$, and
inverse temperature $\beta \in \mathbb{R}$, let $S_\beta(A)$ be the compact convex set of KMS states. Then

$$\partial_e S_\beta(A) = S_\beta(A) \cap S_p(A), \qquad (9.126)$$

where $S_p(A)$ is the set of primary states on $A$ (cf. Definition 8.17). Consequently,
extreme KMS states at fixed inverse temperature $\beta$ are either equal or disjoint.


This suggests that extreme KMS states define pure thermodynamic phases.


Proof. We enlarge $S_\beta(A)$ to the set $K_\beta(A) \subset A^*$ of all continuous linear functionals
on $A$ that satisfy the $\beta$-KMS condition (so that $S_\beta(A)$ consists of all positive elements
in $K_\beta(A)$ of unit norm). The key to the proof is a bijection between the set $S(\omega)$ of
functionals $\rho \in K_\beta(A)$ for which $0 \leq \rho \leq \omega$, where $\omega \in S_\beta(A)$ is fixed, and the set
$T(\omega)$ of operators $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$ such that $0 \leq c \leq 1_{H_\omega}$, given by

$$\rho(a) = \langle\Omega_\omega, c\,\pi_\omega(a)\Omega_\omega\rangle. \qquad (9.127)$$

This implies the claim, since $\omega \in \partial_e S_\beta(A)$ iff any $\rho \in S(\omega)$ takes the form $\rho = t\omega$ for
some $t \in [0,1]$ (cf. Lemma C.17), which in turn is the case iff $c = t \cdot 1_{H_\omega}$.

First, for any state $\omega \in S(A)$ there is a bijection between the set of linear
functionals $\rho \in A^*$ for which $0 \leq \rho \leq \omega$ and the set of operators $c \in \pi_\omega(A)'$ such that
$0 \leq c \leq 1_{H_\omega}$, given by (9.127). Indeed, in one direction, given $a = b^*b \geq 0$, we have

$$(\omega - \rho)(a) = \langle\pi_\omega(b)\Omega_\omega, (1_{H_\omega} - c)\pi_\omega(b)\Omega_\omega\rangle \geq 0, \qquad (9.128)$$

for if $0 \leq c \leq 1_{H_\omega}$, then $0 \leq (1_{H_\omega} - c) \leq 1_{H_\omega}$. Hence $\rho \leq \omega$, whilst from (9.127)
we similarly find $\rho \geq 0$. Conversely, $\rho$ induces a quadratic form $R$ on $H_\omega$, defined
initially on the dense domain $\pi_\omega(A)\Omega_\omega$ by the formula

$$R(\pi_\omega(a)\Omega_\omega, \pi_\omega(b)\Omega_\omega) = \rho(a^*b), \qquad (9.129)$$

which is easily seen to be well defined, positive, and bounded, and so Proposition
B.79 supplies the operator $c$, which a simple computation shows to be in $\pi_\omega(A)'$.

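For the bijection $S(\omega) \cong T(\omega)$, where $\omega$ is a $\beta$-KMS state as above, we therefore
need the additional property $c \in \pi_\omega(A)''$. Putting $\beta = -1$ for convenience and using
the notation of Theorem C.159, we first show that $\Delta^{-it}c\Delta^{it} = c$ for any $t \in \mathbb{R}$: indeed,
since $\rho$ satisfies the KMS condition, it is time-translation invariant, so that

$$\langle\pi_\omega(a^*)\Omega_\omega, \Delta^{-it}c\Delta^{it}\pi_\omega(b)\Omega_\omega\rangle = \langle\Omega_\omega, c\,\Delta^{it}\pi_\omega(a)\Delta^{-it}\Delta^{it}\pi_\omega(b)\Delta^{-it}\Omega_\omega\rangle$$
$$= \langle\Omega_\omega, c\,\pi_\omega(\alpha_t(ab))\Omega_\omega\rangle$$
$$= \rho(\alpha_t(ab)) = \rho(ab)$$
$$= \langle\pi_\omega(a^*)\Omega_\omega, c\,\pi_\omega(b)\Omega_\omega\rangle,$$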


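so that $\Delta^{-it}c\Delta^{it} = c$ between a dense set of vectors, and hence this is valid as an
operator equation. This also implies that $c$ commutes with any power of $\Delta$. Define
$c' = JcJ$, which by Theorem C.159 is an element of $\pi_\omega(A)''$, and compute

$$\langle\Omega_\omega, \pi_\omega(a)c'\Omega_\omega\rangle = \langle\Omega_\omega, \pi_\omega(a)Jc\Delta^{1/2}\Omega_\omega\rangle = \langle\Omega_\omega, \pi_\omega(a)J\Delta^{1/2}c\Omega_\omega\rangle$$
$$= \langle\Omega_\omega, \pi_\omega(a)Sc\Omega_\omega\rangle = \langle\Omega_\omega, \pi_\omega(a)c^*\Omega_\omega\rangle$$
$$= \langle\Omega_\omega, \pi_\omega(a)c\Omega_\omega\rangle$$
$$= \rho(a), \qquad (9.130)$$

where we used the properties $J\Omega_\omega = \Omega_\omega$, $\Delta^{1/2}\Omega_\omega = \Omega_\omega$, $c\Delta^{1/2} = \Delta^{1/2}c$ as just
mentioned, $S = J\Delta^{1/2}$, and $c^* = c$ (since $c \geq 0$). Finally, it follows from the KMS
condition (applied to the normal extension of the state $\omega$ to $\pi_\omega(A)''$ given by $\langle\Omega_\omega, \cdot\,\Omega_\omega\rangle$
as well as to the normal extension of $\rho$ to $\pi_\omega(A)''$ given by $\langle\Omega_\omega, \cdot\,c'\Omega_\omega\rangle$ just
computed) that $c' \in \pi_\omega(A)'$, since for arbitrary $a,b,d \in A_\alpha$ we have

$$\omega(ac'bd) = \omega(\alpha_i(bd)ac') = \rho(\alpha_i(bd)a) = \rho(\alpha_i(b)\alpha_i(d)a)$$
$$= \rho(\alpha_i(d)ab) = \omega(\alpha_i(d)abc') = \omega(abc'd).$$

In other words, for any $a,b,d \in A$ we have

$$\langle\pi_\omega(a^*)\Omega_\omega, c'\pi_\omega(b)\pi_\omega(d)\Omega_\omega\rangle = \langle\pi_\omega(a^*)\Omega_\omega, \pi_\omega(b)c'\pi_\omega(d)\Omega_\omega\rangle, \qquad (9.131)$$

so that $c'\pi_\omega(b) = \pi_\omega(b)c'$ between vectors in a dense domain, so that this is an
operator equality. Hence $c' \in \pi_\omega(A)'$, and in view of this we may rewrite (9.130)
as $\rho(a) = \langle\Omega_\omega, c'\pi_\omega(a)\Omega_\omega\rangle$. Since the operator $c \in \pi_\omega(A)'$ in (9.127) is uniquely
determined by $\rho$, this shows that $c' = c$. Since we already had $c' \in \pi_\omega(A)''$, it follows
that $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$.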


It can also be shown that $S_\beta(A)$ is a (Choquet) simplex, which is a property rather
more typical of the state space of a commutative unital C*-algebra; this makes it
especially remarkable for the set of $\beta$-KMS states on a highly non-commutative
C*-algebra like the infinite tensor product of $B = M_n(\mathbb{C})$. In the physically relevant case
where $S_\beta(A)$ is metrizable, this implies that for any given KMS state $\omega \in S_\beta(A)$ there
is a unique probability measure $\mu$ on $\partial_e S_\beta(A)$, such that for each $a \in A$,

$$\omega(a) = \int_{\partial_e S_\beta(A)} d\mu(\omega')\, \omega'(a). \qquad (9.132)$$

Conversely, any probability measure $\mu$ on $\partial_e S_\beta(A)$ defines a $\beta$-KMS state by reading
this equality from right to left. Towards the next chapter, suppose for example that
there is a $G$-action on $A$, i.e., a continuous homomorphism $\gamma: G \to \mathrm{Aut}(A)$ (where
$G$ is a locally compact group). Then $G$ also acts on $S(A)$ via the dual maps $\gamma_g^*(\omega) =
\omega \circ \gamma_{g^{-1}}$, and if $G$ is a symmetry of the dynamics in that $\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t$ for each
$t \in \mathbb{R}$ and $g \in G$, then this dual action maps both $S_\beta(A)$ and $\partial_e S_\beta(A)$ into themselves.
If $G$ is compact with normalized Haar measure $\mu$, then for any fixed extremal KMS
state $\omega_0 \in \partial_e S_\beta(A)$, by (left) invariance of $\mu$ one obtains a $G$-invariant state by

$$\omega = \int_G d\mu(g)\, \gamma_g^*\omega_0. \qquad (9.133)$$
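
The invariance of this averaged state can be spelled out in one line. The following is a sketch, assuming the dual action is written $\gamma_g^*(\omega) = \omega \circ \gamma_{g^{-1}}$ (so that $\gamma_h^*\gamma_g^* = \gamma_{hg}^*$) and substituting $g' = hg$ in the Haar integral:

```latex
\gamma_h^*\omega
  = \int_G d\mu(g)\, \gamma_h^*\gamma_g^*\omega_0
  = \int_G d\mu(g)\, \gamma_{hg}^*\omega_0
  = \int_G d\mu(g')\, \gamma_{g'}^*\omega_0
  = \omega \qquad (h \in G).
```

The third equality is exactly where left invariance of the Haar measure enters.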

Notes


§9.1. Symmetries of C*-algebras and Hamhalter’s Theorem

Theorem 9.4 is due to Hamhalter (2011). Our proof, taken almost verbatim from
Landsman & Lindenhovius (2016), roughly follows his, but adds various details and
also takes some different turns. The main differences with the original proof by 
Hamhalter are the following. Firstly, we give an order-theoretic characterization 
of u.s.c. decompositions of the form 7x (and hence of the commutative algebras 
in @(C(X)) that are the unitization of some ideal) by the three axioms stated in 
Lemma 3.1.1 in Firby (1973), whereas Hamhalter uses Proposition 7 in Mendivil 
(1999), which gives a different characterization of unitizations of ideals. Further- 
more, Hamhalter only treats Lemma 9.5 in full generality, whereas in our opinion it 
is very instructive to take the case of finite sets first, where many of the key ideas 
already appear in a setting where they are not overshadowed by topological com- 
plications. Finally, our proof of Lemma 9.6.2 differs from Hamhalter’s proof. The 
topology of partitions may be found in Willard (1970), especially Theorem 9.9. 

Theorem 9.7 is due to Hamhalter (2015). Corollary 9.9 has a long history, starting 
with Jacobson & Rickart (1950) and ending with Thomsen (1982). 


§9.2. Unitary implementability of symmetries

See Bratteli & Robinson (1987), §4.3.


§9.3. Motion in space and in time

For a far more detailed study of asymptotic abelianness see Bratteli & Robinson
(1987), §4.3.2 and Bratteli & Robinson (1997), §5.4.1. Results like Theorem 9.14
may also be found in Sewell (2002). Theorem 9.14 is also valid for ergodic states
with respect to the given $\mathbb{Z}^d$-action, where we say that a state on a C*-algebra $A$
with $G$-action is ergodic if it is an element of $\partial_e(S(A)^G)$, i.e., extreme in the convex
set of $G$-invariant states on $A$. Also Theorem 9.15 holds (with a more complicated
proof, of course) under weaker conditions on $\Phi$, typically exponential decay in $X$.

Theorem 9.15 is the simplest result in this direction; for similar results under
weaker assumptions on the interaction $\Phi$, see Bratteli & Robinson (1997), §6.2.1.


§9.4. Ground states of quantum systems

The idea of a ground state of a quantum system may be attributed to Bohr (1913),
who postulated that an atom has a state of lowest energy (which he called a
“permanent state”). See e.g. Pais (1986), p. 199. In this section, which merely presents
some key points treated in far more detail in Bratteli & Robinson (1997), §5.3.3 and
§6.2.7, we have just scratched the surface of the topic, which is basic to physics.


§9.5. Ground states and equilibrium states of classical spin systems

Basic references for the mathematical physics of classical spin systems on a
lattice are Israel (1979), Simon (1993), van Enter, Fernandez, & Sokal (1993), and
Georgii (2011). One may now define pure thermodynamic phases as extreme
elements of the compact convex set of all Gibbs measures (or of the set of all
translation-invariant Gibbs measures, as in Simon, 1993, §III.5), but there is no
identification of pure thermodynamic phases with primary equilibrium states (as
in the quantum case), because a state on a commutative C*-algebra like $C(\underline{n}^{\mathbb{Z}^d})$ is
primary iff it is pure. Fortunately, the specific measure-theoretic setting of
classical statistical mechanics provides its own resources. For any $\Lambda \subset \mathbb{Z}^d$, let $\Sigma_\Lambda$ be the
smallest σ-algebra (within the Borel σ-algebra for $\underline{n}^{\mathbb{Z}^d}$) for which each $f \in C(\underline{n}^{\Lambda})$
is measurable, and let

$$\Sigma_\infty = \bigcap_{\Lambda} \Sigma_{\Lambda^c}, \qquad (9.134)$$

where each $\Lambda$ is finite, be the σ-algebra at infinity, with associated commutative
C*-algebra $B_\infty(\underline{n}^{\mathbb{Z}^d})$ of all bounded measurable functions on $\underline{n}^{\mathbb{Z}^d}$ that are
$\Sigma_\infty$-measurable. This is the home of the macroscopic observables, defined as averages
analogously to the quantum case. The role of primary states (or rather of states
whose algebra of observables is trivial at infinity, as in Theorem 8.23) is now played
by states that are trivial at infinity, that is, probability measures $\mu$ on $\underline{n}^{\mathbb{Z}^d}$ for which
either $\mu(X) = 0$ or $\mu(X) = 1$ for $X \in \Sigma_\infty$ (cf. the Kolmogorov 0-1 law of probability
theory). Indeed there is a classical version of Theorem 8.23, making exactly the
same claim mutatis mutandis, see Theorem III.1.6 in Simon (1993). The main result
(cf. Theorem 7.7 in Georgii, 2011) is that a state is extreme in the compact convex
set of all Gibbs measures (at fixed temperature and potential, of course) iff it is a
Gibbs measure that is trivial at infinity. It follows that two distinct extreme Gibbs
measures are mutually singular on $\Sigma_\infty$ (which is the pertinent classical version of
disjointness of primary states).


§9.6. Equilibrium (KMS) states of quantum systems

The KMS condition was introduced by Haag, Hugenholtz, and Winnink (1967),
in the following equivalent form:

$$\int_{\mathbb{R}} dt\, f(t - i\beta)\,\omega(a\alpha_t(b)) = \int_{\mathbb{R}} dt\, f(t)\,\omega(\alpha_t(b)a), \qquad (9.135)$$

for each $a,b \in A$ and each Schwartz function $f \in \mathcal{S}(\mathbb{R})$. The name KMS derives
from the earlier observation (9.102) of Kubo (1957) and independently Martin &
Schwinger (1957). See also Haag (1992), Simon (1993), Borchers (2000), Sewell
(2002), Thirring (2002), Emch (2007), and perhaps also, at a heuristic level,
Landsman & van Weert (1987), especially for applications of the KMS condition to
quantum field theory at finite temperature and the quark-gluon plasma (this, incidentally,
was the MSc thesis as well as the first major published paper by the author).

The KMS condition also plays a major role in operator algebras and
noncommutative geometry; see Connes (1994) and Connes & Marcolli (2008).

For a proof of (9.101) see Bratteli & Robinson (1997, Lemma 6.2.21); this book
is the bible about the KMS condition and its application to quantum spin systems.

The proof of Proposition 9.25 is taken from Simon (1993), Lemma IV.4.1 and
Proposition IV.4.2. The terminology of pure thermodynamic phases for primary
KMS states (introduced after Theorem 9.31) is not completely standard; also ergodic
states are sometimes called ‘pure phases’.


Chapter 10 
Spontaneous Symmetry Breaking 


As we shall see, the undeniable natural phenomenon of spontaneous symmetry 
breaking (SSB) seems to indicate a serious mismatch between theory and reality. 
This mismatch is well expressed by what is sometimes called Earman’s Principle: 


‘While idealizations are useful and, perhaps, even essential to progress in physics, a sound 
principle of interpretation would seem to be that no effect can be counted as a genuine 
physical effect if it disappears when the idealizations are removed.’ (Earman, 2004, p. 191) 


To describe the various examples apparently violating Earman’s Principle (and 
hence the link between theory and reality) in a general way (so general even that it 
will encapsulate the measurement problem), it is convenient to install a definition: 


Definition 10.1. Asymptotic emergence is the conjunction of three conditions:


1. A higher-level theory H (which is often called a phenomenological theory or
a reduced theory) is a limiting case of some lower-level theory L (often called
a fundamental theory or a reducing theory).

2. Theory H is well defined and understood by itself (typically predating L).

3. Theory H has features that cannot be explained by L, e.g. because L does not
have any property inducing those feature(s) in the pertinent limit to H.


In connection with SSB (as item 3.) we will look at the following pairs (H,L):


• - H is classical mechanics (notably of a particle on the real line $\mathbb{R}$);
  - L is quantum mechanics (on the pertinent Hilbert space $L^2(\mathbb{R})$);
  - The limiting relationship between the two theories is as described in §7.1
    (notably by the continuous bundle of C*-algebras (7.17) - (7.19) for n = 1).
• - H is classical thermodynamics of a spin system;
  - L is statistical mechanics of a quantum spin system on a finite lattice;
  - Their limiting relationship is as described in §8.6 (cf. Theorem 8.4).
• - H is statistical mechanics of an infinite quantum spin system;
  - L is statistical mechanics of a quantum spin system on a finite lattice;
  - The limiting relationship between H and L is given in §8.6 (cf. Theorem 8.8).

Of course, there are many other interesting examples of (apparent) asymptotic
emergence not treated in this book, such as geometric optics (as H) versus wave optics
(as L), where the new feature of H would be the absence of interference of light
rays (foreshadowing the measurement problem of quantum mechanics!), or
hydrodynamics (as H) versus molecular dynamics (as L), where the new feature is
irreversibility. Perhaps space-time asymptotically emerges from quantum gravity.

The “unexplained” features of H mentioned in the third part of Definition 10.1 are 
often called emergent, although this term has to be used with great care. Its meaning 
here reflects the original use of the term by the so-called “British Emergentists” 
(whose pioneer was J.S. Mill), as expressed in 1925 by C.D. Broad: 


‘The characteristic behaviour of the whole could not, even in theory, be deduced from the 
most complete knowledge of the behaviour of its components, taken separately or in other 
combinations, and of their proportions and arrangements in this whole. This is what I un- 
derstand by the ‘Theory of Emergence’. I cannot give a conclusive example of it, since it is 
a matter of controversy whether it actually applies to anything.’ (Broad, 1925, p. 59) 


In quotations like these, the notion “emergence” is meant to be the very opposite of 
the idea of “reduction” (or “mechanicism’’, as Broad called it); in fact, for many au- 
thors this opposition seems to be the principal attraction of emergence. In principle, 
two rather different notions of reduction then lead (contrapositively) to two different 
kinds of emergence, which are sometimes mixed up but should be distinguished: 


1. The reduction of a whole (i.e., a composite system) to its parts; 
2. The reduction of a theory H to a theory L. 


In older literature concerned with the reduction of biology to chemistry (challenged 
by Mill) and of chemistry to physics (still contested by Broad), the first notion also 
referred to wholes consisting of a small number of particles. That notion of emer- 
gence seems a lost cause, since, as noted by Hempel, 


‘the properties of hydrogen include that of forming, if suitably combined with oxygen, a 
compound which is liquid, transparent, etc.’ (Hempel, 1965, p. 260) 


A similar comment applies to e.g. the tertiary structure of proteins, but also to cases 
of emergence such as ant hills, slime mold, and even large cities (Johnson, 2001), 
all of which are actually fascinating success stories for reductionism. 

More recently, the apparent possibility that very large assemblies of parts might 
give rise to emergent properties of the corresponding wholes has become increas- 
ingly popular, both in physics and in the philosophy of mind (where consciousness 
has been proposed as an emergent property of the brain). In physics, the modern
discussion on emergence was initiated by P.W. Anderson, who in a famous
essay from 1972 called ‘More is different’ emphasized the possibility of emergence
in very large systems (surprisingly, Anderson actually avoids the term ‘emergence’,
instead speaking of ‘new laws’ and ‘a whole new conceptual structure’). In
particular, Anderson claimed SSB to be an example (if not the example) of emergence,
duly adding that one really had to take the $N \to \infty$ limit. Thus at least in physics, the
interesting case for emergence in the first (i.e. whole-part) sense arises if the ‘whole’
is strictly infinite, as in the thermodynamic limit of quantum statistical mechanics.
This example confirms that 1. and 2. often go together, but they do not always do so:
the classical limit of quantum mechanics is a case of pure theory reduction.
A clear description of emergence has also been given by Jaegwon Kim: 


‘1. Emergence of higher-level properties: All properties of higher-level entities arise out of
the properties and relations that characterize their constituent parts. Some properties of 
these higher, complex systems are “emergent”, and the rest merely “resultant”. Instead 
of the expression “arise out of”, such expressions as “supervene on” and “are conse- 
quential upon” could have been used. In any case, the idea is that when appropriate 
lower-level conditions are realized in a higher-level system (that is, the parts that con- 
stitute the system come to be configured in a certain relational structure), the system 
will necessarily exhibit certain higher-level properties, and, moreover, that no higher- 
level property will appear unless an appropriate set of lower-level conditions is realized. 
Thus, “arise” and “supervene” are neutral with respect to the emergent/resultant distinc- 
tion: both emergent and resultant properties of a whole supervene on, or arise out of, 
its microstructural, or micro-based, properties. The distinction between properties that 
are emergent and those that are merely resultant is a central component of emergen- 
tism. As we have already seen, it is standard to characterize this distinction in terms of 
predictability and explainability. 

2. The unpredictability of emergent properties: Emergent properties are not predictable 
from exhaustive information concerning their “basal conditions”. In contrast, resultant 
properties are predictable from lower-level information. 


3. The unexplainability/irreducibility of emergent properties: Emergent properties, unlike
those that are merely resultant, are neither explainable nor reducible in terms of their 
basal conditions.’ (Kim, 1999, p. 21, italics added) 


Similarly, Silberstein (2002) states (paraphrased) that a higher-level theory H: 


‘bears predictive/explanatory emergence with respect to some lower-level theory L if L 
cannot replace H, if H cannot be derived from L [i.e., L cannot reductively explain H], or 
if L cannot be shown to be isomorphic to H.’


A key point here is Kim’s no. 1: not even “emergentists” deny that the whole
consists of its parts, or, in asymptotic emergence, that the higher-level theory H in fact
originates from the lower-level theory L. The essence of emergence, then, would be 
that H nonetheless has “acquired” properties not reducible to L. One possibility for 
this to happen could be that the (allegedly) emergent property of H refers to some 
concept that does not even make sense in L, such as the experience of pain, which 
is hard to make sense of at a neural level, but another possibility, which is indeed 
the one relevant to physics and especially to SSB, is that some particular concept 
possessed by H (such as SSB) is admittedly defined within L, but banned. 

In describing the relationship between H and L we have to be clear about the 
difference between approximations and idealizations. Following Norton (2012): 


• An approximation is an inexact description of a target system.
• An idealization is a fictitious system, distinct from the target system, some of
  whose properties provide an inexact description of aspects of the target system.


Thus idealizations also provide approximations, but as systems they stand on their
own and are defined independently of the target system. In our cases, the target
system is a real physical system such as a ferromagnet or a quantum particle, which
is supposed to be described exactly by theory L, i.e., the lower-level theory. In fact,
L is a family of theories parametrized by $1/N$ ($N \in \mathbb{N}$) or $\hbar \in (0,1]$, and our real
material relates to some very small value of this parameter (which may also be seen
as a certain regime of L, seen as a single, unparametrized theory).

The pertinent theory H is an idealization in the above sense, through which one
approximates very large systems by infinite ones and highly semi-classical ones
(where $\hbar$ is very small) by classical ones (where $\hbar = 0$). It is in this setting that
asymptotic emergence would violate Earman’s Principle and hence would blast the
relationship between theory and reality: the abstract point (made concrete for SSB
earlier on) is that if some real property of a real system is described by H but is not
approximated in any sense by L in any regime (as is the threat with SSB), although H
is supposed to be a limit of L, then the latter theory L fails to describe the real system
it is supposed to describe, whereas this system is described by the theory H, which
portrays fictitious systems. This marks a difference with other cases of emergence, 
where H (including some “whole’’) is not an idealization but a real system itself (as 
might be the case with consciousness and other examples from neuroscience and 
the philosophy of mind). Thus our discussion does not apply to such cases. 

The tension between SSB and Earman’s Principle has not quite gone unnoticed in 
the philosophy of physics literature. For example, Liu and Emch (2005) first write 
that it is a mistake to regard idealizations as acts of ‘neglecting the negligible’ (p. 
155, which already appears to deny Earman’s Principle), and continue by: 


“The broken symmetry in question is not reducible to the configurations of the microscopic 
parts of any finite systems; but it should supervene on them in the sense that for any two 
systems that have the exactly (sic) duplicates of parts and configurations, both will have the 
same spontaneous symmetry breaking in them because both will behave identically in the 
limit. In other words, the result of the macroscopic limit is determined by the non-relational 
properties of parts of the finite system in question.” (Liu & Emch, 2005, p. 156) 


It is not easy to make sense of this, but the authors genuinely seem to believe in 
asymptotic emergence and hence they (again) appear to deny Earman’s Principle. 
Another suggestion, made by Ruetsche, is to modify Earman’s Principle to: 


‘No effect predicted by a non-final theory can be counted as a genuine physical effect if it 
disappears from that theory’s successors.’ (Ruetsche, 2011, p. 336) 


For example, the theory L explaining SSB should not be quantum statistical mechan- 
ics but quantum field theory (which has an infinite number of ultraviolet degrees of 
freedom even in finite volume, and hence in principle allows SSB). This does make 
sense within physics, but, as Ruetsche herself notices, her principle ‘has the prag- 
matic shortcoming that we can’t apply it until we know what (all) successors to our 
present theories are.’ With due respect, we will describe a rather different way out, 
based on unexpectedly implementing Butterfield’s Principle, which is a corollary 
to Earman’s Principle that removes the reduction-emergence opposition: 


‘there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to 
the limit, i.e. for finite N. And it is this weaker behaviour which is physically real’ 
(Butterfield, 2011, p. 1065) 


To do so, we now turn our attention to specific (classes of) models of SSB.


10.1 Spontaneous symmetry breaking: The double well


The simplest example of SSB is undoubtedly the equation $x^2 = 1$ (where $x \in \mathbb{C}$),
which is invariant under a $\mathbb{Z}_2$ symmetry given by $x \mapsto -x$. Its solutions $x = \pm 1$,
then, do not share this symmetry; instead $\mathbb{Z}_2$ acts nontrivially on the solution space.
Another example that is simple at least compared to quantum spin systems is
provided by elementary quantum mechanics. Thus we are now in the context of the
first of the three pairs (H,L) listed in the preamble to this chapter, where, in detail:


- H is classical mechanics of a particle moving on the real line, with associated
phase space $\mathbb{R}^2 = \{(p,q)\}$ and ensuing C*-algebra of observables $A_0 = C_0(\mathbb{R}^2)$;

- L is the corresponding quantum theory, with a C*-algebra of observables $A_\hbar$
($\hbar > 0$) taken to be the compact operators $B_0(L^2(\mathbb{R}))$ on the Hilbert space $L^2(\mathbb{R})$;

- The relationship between H and L is given by the continuous bundle of
C*-algebras (7.17) - (7.19), for n = 1, notably in the classical limit $\hbar \to 0$.


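At the level of states, the passage to the classical limit $\hbar \to 0$ of any $\hbar$-dependent
wave-function $\psi_\hbar \in L^2(\mathbb{R})$, if it exists, is described via the associated probability
measure $\mu_{\psi_\hbar}$ on $\mathbb{R}^2$, which is defined by (7.31); in other words,

$$\mu_{\psi_\hbar}(\Delta) = \int_\Delta \frac{d^np\, d^nq}{(2\pi\hbar)^n}\, |\langle\phi_\hbar^{(p,q)}, \psi_\hbar\rangle|^2 \qquad (\Delta \subset \mathbb{R}^{2n}), \qquad (10.1)$$

where the (Schrödinger) coherent states $\phi_\hbar^{(p,q)} \in L^2(\mathbb{R}^n)$ are given by (7.27), i.e.,

$$\phi_\hbar^{(p,q)}(x) = (\pi\hbar)^{-n/4} e^{-ipq/2\hbar} e^{ipx/\hbar} e^{-(x-q)^2/2\hbar}. \qquad (10.2)$$

In terms of the associated vector states $\omega_{\psi_\hbar}$ on the C*-algebra $B_0(L^2(\mathbb{R}))$, one has

$$\omega_{\psi_\hbar}(Q_\hbar^B(f)) = \langle\psi_\hbar, Q_\hbar^B(f)\psi_\hbar\rangle = \int_{\mathbb{R}^2} d\mu_{\psi_\hbar}(p,q)\, f(p,q), \qquad (10.3)$$

where $f \in C_0(\mathbb{R}^2)$. We then say that the wave-functions $\psi_\hbar$ have a classical limit if

$$\lim_{\hbar\to 0} \int_{\mathbb{R}^2} d\mu_{\psi_\hbar} f = \int_{\mathbb{R}^2} d\mu_0 f, \qquad (10.4)$$

for any $f \in C_c(\mathbb{R}^2)$, where $\mu_0$ is some probability measure on $\mathbb{R}^2$. Seen as a state
on the classical C*-algebra of observables $C_0(\mathbb{R}^2)$, the probability measure $\mu_0$ is
regarded as the classical limit of the family $\omega_{\psi_\hbar}$ of states on the C*-algebra $B_0(L^2(\mathbb{R}))$
of quantum-mechanical observables. This family is continuous in the sense that the
function $\hbar \mapsto \omega_{\psi_\hbar}(\sigma(\hbar))$ from $[0,1]$ to $\mathbb{C}$ is continuous for every continuous
cross-section $\sigma$ of the given bundle of C*-algebras. An example of such a continuous
cross-section is $\sigma(0) = f$ and $\sigma(\hbar) = Q_\hbar^B(f)$, for any $f \in C_0(\mathbb{R}^2)$, cf. (C.550) -
(C.551), and indeed this example reproduces (10.4), which after all is just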


$$\lim_{\hbar\to 0} \omega_{\psi_\hbar}(Q_\hbar^B(f)) = \omega_0(f) \qquad (f \in C_0(\mathbb{R}^2)). \qquad (10.5)$$

First, let us illustrate this formalism for the ground state of the one-dimensional
harmonic oscillator. Taking $m = 1/2$ and $V(x) = \tfrac12\omega^2x^2$ in the usual Hamiltonian

$$h_\hbar = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x), \qquad (10.6)$$

it is well known that the ground state is unique and that its wave-function, i.e.,

$$\psi_\hbar^{(0)}(x) = \left(\frac{\omega}{\sqrt{2}\,\pi\hbar}\right)^{1/4} e^{-\omega x^2/(2\sqrt{2}\,\hbar)}, \qquad (10.7)$$

is a Gaussian, peaked above $x = 0$. As $\hbar \to 0$, this ground state has a classical limit,
namely the Dirac measure $\mu_0$ concentrated at the origin $(p = 0, q = 0)$, i.e.,

$$\lim_{\hbar\to 0} \int_{\mathbb{R}^2} d\mu_{\psi_\hbar^{(0)}} f = f(0,0) \qquad (f \in C_0(\mathbb{R}^2)). \qquad (10.8)$$

This is just the unique ground state of the corresponding classical Hamiltonian

$$h_0(p,q) = p^2 + V(q), \qquad (10.9)$$

seen as a point in the phase space $\mathbb{R}^2$ minimizing $h_0$, reinterpreted as a probability
measure on phase space as explained in the context of Theorem 3.3. Note that we
kept the mass fixed at $m = 1/2$, but instead we could have kept $\hbar$ fixed and taken the
limit $m \to \infty$ instead of $\hbar \to 0$; cf. the preamble to Chapter 7.

The same features hold for the anharmonic oscillator (with small $\lambda > 0$), i.e.,

$$V(x) = \tfrac12\omega^2x^2 + \tfrac14\lambda x^4. \qquad (10.10)$$

However, a new situation arises for the symmetric double-well potential

$$V(x) = -\tfrac12\omega^2x^2 + \tfrac14\lambda x^4 + \tfrac14\omega^4/\lambda = \tfrac14\lambda(x^2 - a^2)^2, \qquad (10.11)$$

where $a = \omega/\sqrt{\lambda} > 0$ (assuming $\omega > 0$ as well as $\lambda > 0$). This time, the ground
state of the classical Hamiltonian is doubly degenerate, being given by the points
$(p = 0, q = \pm a) \in \mathbb{R}^2$, with ensuing Dirac measures $\mu_0^\pm$ given by

$$\int_{\mathbb{R}^2} d\mu_0^\pm f = f(0, \pm a). \qquad (10.12)$$

But it is a deep and counterintuitive fact of quantum theory that the corresponding
quantum Hamiltonian (10.6) with (10.11) has a unique ground state. Indeed:


Theorem 10.2. Let $V \in L^2_{\mathrm{loc}}(\mathbb{R}^m)$ be positive and suppose that $\lim_{|x|\to\infty} V(x) = \infty$.
Then $-\Delta + V$ has a nondegenerate (and strictly positive) ground state.


Roughly speaking, the proof is based on an infinite-dimensional version of the 
Perron—Frobenius Theorem in linear algebra (applied to exp(—th;,) rather than to 
the Hamiltonian hy, itself, so that the largest eigenvalue of the former corresponds to 
the smallest eigenvalue of the latter, i.e., the energy of the ground state). 
And yet there are two quantum-mechanical shadows of the classical degeneracy:

• The wave-function $\psi_\hbar^{(0)}$ of the ground state (which by a suitable choice of phase may be taken to be real) is positive definite and has two peaks, above $x = \pm a$, with exponential decay $|\psi_\hbar^{(0)}(x)| \sim \exp(-1/\hbar)$ in the classically forbidden region.
• Energy eigenfunctions (and the associated eigenvalues) come in pairs.

In what follows, we will be especially interested in the first excited state $\psi_\hbar^{(1)}$, which like $\psi_\hbar^{(0)}$ is real, but has one peak above $x = a$ and another peak below $x = -a$. See Figure 10.1. The eigenvalue splitting (or “gap”) vanishes exponentially in $-1/\hbar$ like

$$\Delta_\hbar \equiv E_\hbar^{(1)} - E_\hbar^{(0)} \sim \frac{\hbar\omega}{\sqrt{\tfrac12 e\pi}}\, e^{-d_V/\hbar} \qquad (\hbar \to 0), \tag{10.13}$$

where the typical WKB-factor is given by

$$d_V = \int_{-a}^{a} dx\, \sqrt{V(x)}. \tag{10.14}$$
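To get a feeling for how fast the gap closes, the WKB factor (10.14) and the leading-order estimate (10.13) can be evaluated directly. The sketch below assumes the double well (10.11) with the illustrative values $\omega = \lambda = 1$; the function names are hypothetical, and the integral is approximated by a simple midpoint rule.

```python
import math

def double_well(x, omega=1.0, lam=1.0):
    """V(x) = (lam/8) (x^2 - a^2)^2 with a = omega/sqrt(lam), as in (10.11)."""
    a_squared = omega ** 2 / lam
    return lam / 8.0 * (x * x - a_squared) ** 2

def wkb_factor(omega=1.0, lam=1.0, n=20000):
    """Midpoint-rule estimate of d_V, the integral of sqrt(V) over [-a, a], cf. (10.14)."""
    a = omega / math.sqrt(lam)
    dx = 2.0 * a / n
    return dx * sum(math.sqrt(double_well(-a + (i + 0.5) * dx, omega, lam)) for i in range(n))

def gap_estimate(hbar, omega=1.0, lam=1.0):
    """Leading-order splitting (10.13): hbar*omega/sqrt(e*pi/2) * exp(-d_V/hbar)."""
    prefactor = hbar * omega / math.sqrt(0.5 * math.e * math.pi)
    return prefactor * math.exp(-wkb_factor(omega, lam) / hbar)

for hbar in (0.5, 0.1, 0.05, 0.01):
    print(hbar, gap_estimate(hbar))
```

For this particular potential the integral can also be done by hand, giving $d_V = \tfrac{\sqrt2}{3}\,\omega^3/\lambda$, which the numerical value should reproduce.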


Also, the probability density of each of the wave-functions $\psi_\hbar^{(0)}$ or $\psi_\hbar^{(1)}$ contains approximate $\delta$-function peaks above both classical minima $\pm a$. See Figure 10.2, displayed just for $\psi_\hbar^{(0)}$, the other being analogous. We can make the correspondence between the nondegenerate pair $(\psi_\hbar^{(0)}, \psi_\hbar^{(1)})$ of low-lying quantum-mechanical wave-functions and the pair $(\mu_0^+, \mu_0^-)$ of degenerate classical ground states more transparent by invoking the above notion of a classical limit of states. Indeed, in terms of the corresponding algebraic states $\omega_{\psi_\hbar^{(0)}}$ and $\omega_{\psi_\hbar^{(1)}}$, one has

$$\lim_{\hbar\to 0} \omega_{\psi_\hbar^{(0)}} = \lim_{\hbar\to 0} \omega_{\psi_\hbar^{(1)}} = \mu_0^{(0)}; \tag{10.15}$$

$$\mu_0^{(0)} \equiv \tfrac12(\mu_0^+ + \mu_0^-), \tag{10.16}$$

where $\mu_0^\pm$ are the pure classical ground states (10.12) of the double-well Hamiltonian. To see this, one may consider numerically computed Husimi functions, as shown in Figure 10.3 (just for $\psi_\hbar^{(0)}$, as before). From this, it is clear that the pure (algebraic) quantum ground state $\omega_{\psi_\hbar^{(0)}}$ converges to the mixed classical state (10.16). In contrast, the localized (but now time-dependent) wave-functions

$$\psi_\hbar^\pm = \frac{\psi_\hbar^{(0)} \pm \psi_\hbar^{(1)}}{\sqrt2}, \tag{10.17}$$

which of course define pure states as well, converge to pure classical states, i.e.,

$$\lim_{\hbar\to 0} \omega_{\psi_\hbar^\pm} = \mu_0^\pm. \tag{10.18}$$

In conclusion, one has SSB in H, but at first sight the underlying theory L seems to forbid it. Yet we will now show that (10.17) - (10.18) will save Earman’s Principle.
Fig. 10.1 Double-well potential with ground state $\psi_\hbar^{(0)}$ and first excited state $\psi_\hbar^{(1)}$.

Fig. 10.2 Probability densities for $\psi_\hbar^{(0)}$ at two values of $\hbar$ (left and right).

Fig. 10.3 Husimi functions for $\psi_\hbar^{(0)}$ at two values of $\hbar$ (left and right).
10.2 Spontaneous symmetry breaking: The flea

Regarding the doubly-peaked ground state $\psi_\hbar^{(0)}$ of the symmetric double well as the quantum-mechanical counterpart of a hung parliament, the analogue of a small party that decides which coalition is formed is a tiny asymmetric perturbation $\delta V$ of the potential. Indeed, the following spectacular phenomenon in the theory of Schrödinger operators was discovered in 1981 by Jona-Lasinio, Martinelli and Scoppola. In view of the extensive (and very complicated) ensuing mathematical literature, we just take it as our goal to explain the main idea in a heuristic way. Replace $V$ in (10.6) by $V + \delta V$, where $\delta V$ (i.e., the “flea”) is assumed to:


1. Be real-valued with fixed sign, and $C_c^\infty$ (hence bounded) with connected support not including the minima $x = a$ or $x = -a$;

2. Satisfy $|\delta V| \gg e^{-d_V/\hbar}$ for sufficiently small $\hbar$ (e.g., by being independent of $\hbar$);

3. Be localized not too far from at least one of the minima, in the following sense. First, for $y, z \in \mathbb{R}$ and $A \subset \mathbb{R}$, we extend the notation (10.14) to

$$d_V(y,z) = \left| \int_y^z dx\, \sqrt{V(x)} \right|; \tag{10.19}$$

$$d_V(y,A) = \inf\{ d_V(y,z),\ z \in A \}. \tag{10.20}$$

Second, we introduce the symbols

$$d_V' = 2 \cdot \min\{ d_V(-a, \mathrm{supp}\, \delta V),\ d_V(a, \mathrm{supp}\, \delta V) \}; \tag{10.21}$$

$$d_V'' = 2 \cdot \max\{ d_V(-a, \mathrm{supp}\, \delta V),\ d_V(a, \mathrm{supp}\, \delta V) \}. \tag{10.22}$$

The localization assumption on $\delta V$ is that one of the following conditions holds:

$$d_V' < d_V < d_V''; \tag{10.23}$$

$$d_V' < d_V'' < d_V. \tag{10.24}$$

In the first case, the perturbation is typically localized either on the left or on the right edge of the double well, whereas in the second it resides on the middle bump (symmetric perturbations are excluded by 3, as these would satisfy $d_V' = d_V''$).
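Conditions (10.23) and (10.24) are easy to test for a concrete support. The sketch below is an illustration under stated assumptions: the double well (10.11) with $\omega = \lambda = 1$ (so $a = 1$), a flea whose support is a single interval, and hypothetical helper names. It evaluates (10.19) - (10.22) by a midpoint rule and reports which localization condition holds.

```python
import math

OMEGA, LAM = 1.0, 1.0            # illustrative parameters (assumed)
A = OMEGA / math.sqrt(LAM)       # classical minima sit at x = +A and x = -A

def potential(x):
    """Double-well potential (10.11)."""
    return LAM / 8.0 * (x * x - A * A) ** 2

def agmon(y, z, n=20000):
    """d_V(y, z) of (10.19): |integral of sqrt(V) from y to z|, by the midpoint rule."""
    dx = (z - y) / n
    return abs(dx * sum(math.sqrt(potential(y + (i + 0.5) * dx)) for i in range(n)))

def distance_to_support(y, support):
    """d_V(y, supp) of (10.20) for an interval support = (lo, hi).

    Because sqrt(V) >= 0, the infimum is attained at the point of the
    interval that lies nearest to y.
    """
    lo, hi = support
    nearest = min(max(y, lo), hi)
    return agmon(y, nearest)

def classify(support, tol=1e-9):
    """Return "(10.23)" or "(10.24)" if that condition holds, else None."""
    d_v = agmon(-A, A)
    left = distance_to_support(-A, support)
    right = distance_to_support(A, support)
    d_min, d_max = 2.0 * min(left, right), 2.0 * max(left, right)
    if d_max - d_min <= tol:          # symmetric case: d'_V = d''_V
        return None
    if d_min < d_v < d_max:
        return "(10.23)"
    if d_max < d_v:
        return "(10.24)"
    return None

print(classify((1.2, 1.5)))    # support on the right edge of the well
print(classify((-0.1, 0.3)))   # asymmetric support on the middle bump
print(classify((-0.2, 0.2)))   # symmetric support, excluded by assumption 3
```

The three sample supports are chosen to land in the edge case, the middle-bump case, and the excluded symmetric case, respectively.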


Under these assumptions, the ground state wave-function $\psi_\hbar^{(\delta)}$ of the perturbed Hamiltonian (which had two peaks for $\delta V = 0$!) localizes as $\hbar \to 0$, in a direction which, given that localization happens, may be understood from energetic considerations. For example, if $\delta V$ is positive and is localized to the right, then the relative energy in the left-hand part of the double well is lowered, so that localization will be to the left. See Figures 10.4 - 10.6. Eqs. (10.17) - (10.18) then yield Butterfield’s Principle (with $N \rightsquigarrow 1/\hbar$), so that also Earman’s Principle is saved: the essence of the argument is that (at least in the presence of a flea-perturbation) SSB is already foreshadowed in quantum mechanics for small yet positive $\hbar$, if only approximately.
Fig. 10.4 Flea perturbation of ground state $\psi_\hbar^{(\delta)}$ with corresponding Husimi function. For such relatively large values of $\hbar$, little (but some) localization takes place.

Fig. 10.5 Same at $\hbar = 0.01$. For such small values of $\hbar$, localization is almost total.

Fig. 10.6 First excited state for $\hbar = 0.01$. Note the opposite localization area.
In more detail, for the perturbed ground state we have (subject to assumptions 1-3):

$$\frac{\psi_\hbar^{(\delta)}(a)}{\psi_\hbar^{(\delta)}(-a)} \sim e^{\mp d_V/\hbar} \qquad (\pm\delta V > 0,\ \mathrm{supp}(\delta V) \subset \mathbb{R}^+); \tag{10.25}$$

$$\frac{\psi_\hbar^{(\delta)}(a)}{\psi_\hbar^{(\delta)}(-a)} \sim e^{\pm d_V/\hbar} \qquad (\pm\delta V > 0,\ \mathrm{supp}(\delta V) \subset \mathbb{R}^-), \tag{10.26}$$

with the opposite localization for the perturbed first excited state (so as to remain orthogonal to the ground state). A more precise version of the energetics used above is as follows. The ground state tries to minimize its energy according to the rules:

• The cost of localization (if $\delta V = 0$) is $O(e^{-d_V/\hbar})$.
• The cost of turning on $\delta V$ is $O(e^{-d_V'/\hbar})$ when the wave-function is delocalized.
• The cost of turning on $\delta V$ is $O(e^{-d_V''/\hbar})$ when the wave-function is localized in the well around $x_0 = \pm a$ for which $2\, d_V(x_0, \mathrm{supp}\, \delta V) = d_V''$, cf. (10.22).

In any case, these results only depend on the support of $\delta V$, but not on its size: this means that the tiniest of perturbations may cause collapse in the classical limit.

Although the collapse of the perturbed ground state for small $\hbar$ is a mathematical theorem, it remains enigmatic. Indeed, despite the fact that in quantum theory the localizing effect of the flea is enhanced for small $\hbar$, the corresponding classical system has no analogue of it. Trivially, a classical particle residing at one of the two minima of the double well at zero (or small) velocity, i.e., in one of its degenerate ground states, will not even notice the flea; the ground states are unchanged. But even under a stochastic perturbation, which leads to a nonzero probability for the particle to be driven from one ground state to the other in finite time (as some form of classical “tunneling”, where in this case the necessary fluctuations come from Brownian motion), the flea plays a negligible role. For example, in the case at hand the standard Eyring–Kramers formula for the mean transition time reads

$$\langle\tau\rangle \simeq \frac{2\pi}{\sqrt{V''(a)\,|V''(0)|}}\, e^{V(0)/\varepsilon}, \tag{10.27}$$

where $\varepsilon$ is the parameter in the Langevin equation $dx_t = -\nabla V(x_t)\,dt + \sqrt{2\varepsilon}\,dW_t$, in which $W_t$ is standard Brownian motion. Clearly, this expression only contains the height of the potential at its maximum and its curvature at its critical points; most perturbations satisfying assumptions 1-3 above do not affect these quantities.
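The insensitivity of (10.27) to the flea can be made concrete. The following sketch assumes the double well (10.11) with $\omega = \lambda = 1$, a hypothetical smooth bump supported in $(1.2, 1.5)$ as the flea, and finite-difference curvatures; none of these choices come from the text. It evaluates the Eyring–Kramers estimate with and without the perturbation.

```python
import math

OMEGA, LAM = 1.0, 1.0            # illustrative parameters (assumed)
A = OMEGA / math.sqrt(LAM)

def double_well(x):
    """Double-well potential (10.11)."""
    return LAM / 8.0 * (x * x - A * A) ** 2

def flea(x, lo=1.2, hi=1.5, height=1e-3):
    """Hypothetical smooth bump supported in (lo, hi), away from x = 0 and x = +A, -A."""
    if not lo < x < hi:
        return 0.0
    u = (2.0 * x - lo - hi) / (hi - lo)        # maps (lo, hi) onto (-1, 1)
    return height * math.exp(-1.0 / (1.0 - u * u))

def perturbed(x):
    return double_well(x) + flea(x)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def mean_transition_time(potential, eps):
    """Eyring-Kramers estimate (10.27) for wells at +A, -A and a barrier at 0."""
    curvature_well = second_derivative(potential, A)
    curvature_top = abs(second_derivative(potential, 0.0))
    barrier = potential(0.0) - potential(A)
    return 2.0 * math.pi / math.sqrt(curvature_well * curvature_top) * math.exp(barrier / eps)

print(mean_transition_time(double_well, 0.05))
print(mean_transition_time(perturbed, 0.05))
```

Both calls sample the potential only near $x = 0$ and $x = a$, where this flea vanishes, so the two estimates coincide, in line with the remark that the flea plays a negligible role classically.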

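The instability of the ground state of the double-well potential under “flea” perturbations as $\hbar \to 0$ is easy to understand (at least heuristically) if one truncates the infinite-dimensional Hilbert space $L^2(\mathbb{R})$ to a two-level system. This simplification is accomplished by keeping only the lowest energy states $\psi_\hbar^{(0)}$ and $\psi_\hbar^{(1)}$, in which case the full Hamiltonian (10.6) with (10.11) is reduced to the $2 \times 2$ matrix

$$H_0 = \frac12 \begin{pmatrix} 0 & -\Delta \\ -\Delta & 0 \end{pmatrix}, \tag{10.28}$$

with $\Delta > 0$ given by (10.13). Dropping $\hbar$, the eigenstates of $H_0$ are given by

$$\varphi_0^{(0)} = \frac{1}{\sqrt2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \varphi_1^{(0)} = \frac{1}{\sqrt2}\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \tag{10.29}$$

with energies $E_0 = -\tfrac12\Delta$ and $E_1 = \tfrac12\Delta$, respectively; in particular, $E_1 - E_0 = \Delta$. If

$$\varphi_\pm = \frac{\varphi_0^{(0)} \pm \varphi_1^{(0)}}{\sqrt2}, \tag{10.30}$$

as in (10.17), then

$$\varphi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \varphi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{10.31}$$

Hence in this approximation $\varphi_+$ and $\varphi_-$ play the role of wave-functions (10.17) localized above the classical minima $x = +a$ and $x = -a$, respectively, with classical limits $\mu_0^\pm$. The “flea” is introduced as follows. If its support is in $\mathbb{R}^+$, we put

$$\delta_+ V = \begin{pmatrix} 0 & 0 \\ 0 & \delta \end{pmatrix}, \tag{10.32}$$

where $\delta \in \mathbb{R}$ is a constant. A perturbation with support in $\mathbb{R}^-$ is approximated by

$$\delta_- V = \begin{pmatrix} \delta & 0 \\ 0 & 0 \end{pmatrix}. \tag{10.33}$$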


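Without loss of generality, take the latter (a change of sign of $\delta$ leads to the former). The eigenvalues of $H^{(\delta)} = H_0 + \delta_- V$ are $E_0 = E_-$ and $E_1 = E_+$ with energies

$$E_\pm = \tfrac12\left(\delta \pm \sqrt{\delta^2 + \Delta^2}\right), \tag{10.34}$$

and normalized eigenvectors

$$\varphi_0^{(\delta)} = \frac{1}{\sqrt2}\left(\delta^2 + \Delta^2 + \delta\sqrt{\delta^2 + \Delta^2}\right)^{-1/2} \begin{pmatrix} \Delta \\ \delta + \sqrt{\delta^2 + \Delta^2} \end{pmatrix}; \tag{10.35}$$

$$\varphi_1^{(\delta)} = \frac{1}{\sqrt2}\left(\delta^2 + \Delta^2 - \delta\sqrt{\delta^2 + \Delta^2}\right)^{-1/2} \begin{pmatrix} \Delta \\ \delta - \sqrt{\delta^2 + \Delta^2} \end{pmatrix}. \tag{10.36}$$

Note that $\lim_{\delta\to 0} \varphi_i^{(\delta)} = \varphi_i^{(0)}$ for $i = 0,1$. Now, if $\hbar \to 0$, then $|\delta| \gg \Delta$, in which case $\varphi_0^{(\delta)} \to \varphi_\pm$ for $\pm\delta > 0$ (and starting from (10.32) instead of (10.33) would have given the opposite case, i.e., $\varphi_0^{(\delta)} \to \varphi_\mp$ for $\pm\delta > 0$). Thus the ground state localizes as $\hbar \to 0$, which resembles the situation (10.25) - (10.26) for the full double-well.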
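The two-level mechanism can be traced numerically. The sketch below is an illustration under assumptions: it takes the gap from the leading-order formula (10.13) with $\omega = \lambda = 1$ (for which $d_V = \sqrt2/3$), fixes a small hypothetical flea strength $\delta = 10^{-3}$, and evaluates the ground state of $H_0 + \mathrm{diag}(\delta, 0)$ from (10.34) - (10.35). The output is phrased in terms of the two vector components rather than the labels $\varphi_\pm$.

```python
import math

D_V = math.sqrt(2.0) / 3.0       # WKB factor (10.14) for omega = lambda = 1 (assumed)

def gap(hbar):
    """Leading-order splitting (10.13) with omega = 1."""
    return hbar / math.sqrt(0.5 * math.e * math.pi) * math.exp(-D_V / hbar)

def hamiltonian(delta, gap_value):
    """H_0 + diag(delta, 0), with H_0 as in (10.28)."""
    return ((delta, -0.5 * gap_value), (-0.5 * gap_value, 0.0))

def ground_state(delta, gap_value):
    """Energy E_- of (10.34) and a unit eigenvector proportional to (10.35).

    For delta < 0 the equivalent form (root - delta, gap) is used to
    avoid cancellation when |delta| is much larger than the gap.
    """
    root = math.hypot(delta, gap_value)
    energy = 0.5 * (delta - root)
    if delta >= 0.0:
        x, y = gap_value, delta + root
    else:
        x, y = root - delta, gap_value
    norm = math.hypot(x, y)
    return energy, (x / norm, y / norm)

DELTA = 1e-3                     # hypothetical flea strength, independent of hbar
for hbar in (0.5, 0.1, 0.05, 0.01):
    _, (c1, c2) = ground_state(DELTA, gap(hbar))
    print(hbar, c1 * c1, c2 * c2)    # weights on the two basis vectors
```

For the larger values of `hbar` the gap dominates and the weights stay close to one half each; once the gap drops below `DELTA`, essentially all weight sits on the component whose diagonal entry was not raised.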

In conclusion, in the (practically unavoidable) presence of asymmetric “flea” perturbations, explicit (rather than spontaneous) symmetry breaking already takes place for positive $\hbar$, so that Butterfield’s Principle holds, and hence also Earman’s.
10.3 Spontaneous symmetry breaking in quantum spin systems

Before discussing SSB in quantum spin systems, we return to ground states and KMS states as discussed in the generality of §§9.4–9.6. Starting with the former, it is natural to ask whether ground states are pure, as would be expected on physical grounds; indeed, this question goes to the heart of SSB. Proposition 9.20 implies that ground states (for given dynamics) form a compact convex subset $S_\infty(A)$ of the total state space $S(A)$; the notation $S_\infty(A)$ (rather than e.g. $S_0(A)$) will be motivated shortly by the analogy with equilibrium states. It would be desirable that

$$\partial_e S_\infty(A) = S_\infty(A) \cap \partial_e S(A), \tag{10.37}$$

in which case extreme ground states are necessarily pure. This will indeed be the case in the simple models we study in this book, but it is provably the case in general only under additional assumptions, such as weak asymptotic abelianness of the dynamics, i.e., $\lim_{t\to\infty} \omega([\alpha_t(a), b]) = 0$ for all $a, b \in A$. A weaker sufficient condition for (10.37) is that $\pi_\omega(A)'$ be commutative (which is the case if $\omega$ is pure).

We are now in a position to define SSB, at least in the context of ground states. 


Definition 10.3. Suppose we have a (topological) group $G$ and a (continuous) homomorphism $\gamma: G \to \mathrm{Aut}(A)$, which is a symmetry of the dynamics in that

$$\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t \qquad (g \in G,\ t \in \mathbb{R}). \tag{10.38}$$

The $G$-symmetry is said to be spontaneously broken (at temperature $T = 0$) if

$$(\partial_e S_\infty(A))^G = \emptyset, \tag{10.39}$$

and weakly broken if $(\partial_e S_\infty(A))^G \neq \partial_e S_\infty(A)$, i.e., there is at least one $\omega \in \partial_e S_\infty(A)$ that fails to be $G$-invariant (although invariant extreme ground states may exist).

Here $\mathscr{S}^G = \{\omega \in \mathscr{S} \mid \omega \circ \gamma_g = \omega\ \forall g \in G\}$, defined for any subset $\mathscr{S} \subset S(A)$, is the set of $G$-invariant states in $\mathscr{S}$. Assuming (10.37), eq. (10.39) means that there are no pure $G$-invariant ground states. This by no means implies that there are no $G$-invariant ground states at all, quite to the contrary: for compact, or, more generally, amenable groups $G$, one can always construct $G$-invariant ground states by averaging over $G$, exploiting the fact that if $G$ is a symmetry of the dynamics, then each affine homeomorphism $\gamma_g^*$ of $S(A)$ (defined by $\gamma_g^*(\omega) = \omega \circ \gamma_g$) maps $S_\infty(A)$ to itself. Definition 10.3 therefore implies that if SSB occurs, then one has a dichotomy:

• Pure ground states are not invariant, whilst invariant ground states are not pure.


Definition 10.4. We call a $G$-symmetry spontaneously broken at inverse temperature $\beta \in (0,\infty)$ if there are no $G$-invariant extreme $\beta$-KMS states, i.e.,

$$(\partial_e S_\beta(A))^G = \emptyset, \tag{10.40}$$

and weakly broken if there is at least one non-$G$-invariant extreme $\beta$-KMS state.
By Theorem 9.31 we may replace extreme $\beta$-KMS states by primary $\beta$-KMS states, so that, similarly to ground states, SSB at nonzero temperature means that:

• Primary KMS states are not invariant, whilst invariant KMS states are not primary.

For the next result, please recall Definition 9.10 and Theorem 9.11.


Proposition 10.5. Let $A$ be a quasi-local C*-algebra of the kind (8.130) and suppose the given $G$-action $\gamma$ commutes not only with time translations $\alpha_t$ but also with space translations $\tau_x$. If $\omega \circ \gamma_g \neq \omega$ for some $\omega \in \partial_e S_\beta(A)$ and $g \in G$, then the automorphism $\gamma_g$ cannot be unitarily implemented in the GNS-representation $\pi_\omega$.

This is true also at $\beta = \infty$, i.e., for ground states.

Proof. This is an obvious corollary of Proposition 9.13 and Theorems 9.14 and 9.31: if $\gamma_g$ were implementable by a unitary $u_g$, then $u_g\Omega_\omega \neq \Omega_\omega$ (not even up to a phase), since $\omega \circ \gamma_g \neq \omega$. But in that case, since $\tau_x \circ \gamma_g = \gamma_g \circ \tau_x$ for each $x \in \mathbb{Z}^d$, we would have $u_x u_g = u_g u_x$ and hence $u_x(u_g\Omega_\omega) = u_g\Omega_\omega$. Thus $u_g\Omega_\omega$ would be another translation-invariant ground state, contradicting Theorem 9.14.


This result is worth mentioning, since some authors define SSB through the conclusion of this proposition, that is, they call a symmetry $\gamma_g$ (spontaneously) broken by some state $\omega$ iff $\gamma_g$ cannot be unitarily implemented in $\pi_\omega$. This definition seems physically dubious, however, because quantum spin systems may have ground states $\omega$ that are not $G$-invariant but in which nonetheless all of $G$ is unitarily implementable (in such states translation invariance has to be broken, of course). For example, the Ising model in $d = 1$ with ferromagnetic nearest-neighbour interaction and vanishing external magnetic field (where $G = \mathbb{Z}_2$) has an infinite number of such ground states, in which a “domain wall” separates infinitely many “spins up” to the left from infinitely many “spins down” to the right. Although this model has a unique KMS state at any nonzero temperature, such ground states (and perhaps analogous states at $\beta \neq \infty$ in different models, so far understood only heuristically) seem far from pathological and play a major role in modern condensed matter physics.
Hence we trust this alternative definition only if the states it singles out also satisfy 
Definition 10.3 or 10.4, for which Proposition 10.5 gives a sufficient condition: for 
translation-invariant states and symmetries on quasi-local algebras, our definition of 
SSB through (10.40) is compatible with the one based on unitary implementability. 
This is fortunate, since the physicist’s notion of an order parameter, through 
which at least weak SSB may be detected, is tailored to translation-invariant states: 


Definition 10.6. Let $A$ be a quasi-local C*-algebra as in (8.130), with symmetry group $G$. A (strong) order parameter in $A$ is an $n$-tuple $\varphi = (\varphi_1, \ldots, \varphi_n) \in A^n$ for which $\omega(\varphi) = 0$ if (and only if) $\omega$ is $G$-invariant, for any $\mathbb{Z}^d$-invariant state $\omega$ on $A$.

An order parameter defines an accompanying vector field $x \mapsto \varphi(x)$ by $\varphi_i(x) = \tau_x(\varphi_i)$. Since $\omega$ is translation-invariant, $\omega(\varphi) = 0$ is equivalent to $\omega(\varphi(x)) = 0$ for all $x$. In the Ising model, with $G = \mathbb{Z}_2$, $\sigma_3(0)$ is an order parameter, which can be extended to a strong one $\varphi = (\sigma_2(0), \sigma_3(0))$. In the Heisenberg model, where $G = SO(3)$, the triple $(\sigma_1(0), \sigma_2(0), \sigma_3(0))$ provides a strong order parameter.




Theorem 10.7. Suppose that $\varphi$ is a (strong) order parameter, as in Definition 10.6. Then a $G$-invariant and translation-invariant KMS state $\omega \in S_\beta(A)^G$ (including $\beta = \infty$, i.e., a ground state) displays weak SSB, in the sense that at least one of the components in its extremal decomposition fails to be $G$-invariant, if (and only if) the associated two-point function exhibits long-range order, in that

$$\lim_{x\to\infty} \omega\Big(\sum_{i=1}^n \varphi_i(0)^*\varphi_i(x)\Big) > 0. \tag{10.41}$$

Proof. The “if” part of the theorem is equivalent to the vanishing of the limit in question in the absence of SSB. Let (9.132) be the extremal decomposition of $\omega$. If (almost) each extreme state $\omega'$ is invariant, then $\omega'(\varphi_i(x)) = 0$ for all $i$ by definition of an order parameter, and similarly $\omega'(\varphi_i(x)^*) = \overline{\omega'(\varphi_i(x))} = 0$. Interchanging $\lim_{x\to\infty}$ with the integral over $\partial_e S_\beta(A)$ (which is allowed because $\mu$ is a probability measure), and using (9.30) then shows that the left-hand side of (10.41) vanishes.

To avoid difficult measure-theoretic aspects of the extremal decomposition theory, and also for pedagogical purposes, we prove the “only if” part only in the case

$$\omega = \int_G dg\, \omega'_g, \tag{10.42}$$

weakly, where $\omega' \in \partial_e S_\beta(A)$ and $\omega'_g = \gamma_g^*\omega'$. Since the expression

$$\omega'_g\Big(\sum_{i=1}^n \varphi_i(0)^*\varphi_i(x)\Big)$$

is independent of $g \in G$ (by definition of an order parameter), we may replace $\omega'_g$ by $\omega'$ in the expression for $\omega$; the term $\int_G dg$ then factors out and is equal to unity. Thus we may replace $\omega$ in (10.41) by $\omega'$. Since $\omega'$ is a primary state, we may now use (9.30) once again, so that the left-hand side of (10.41) becomes $\sum_{i=1}^n |\omega'(\varphi_i)|^2$. By assumption, $\omega'$ is not $G$-invariant, so that (by definition of a strong order parameter) at least one of the terms $|\omega'(\varphi_i)|$ is nonzero.
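The simplest instance of this argument is the $\mathbb{Z}_2$ case with the order parameter $\sigma_3(0)$: the invariant state is an equal mixture of the “all up” and “all down” product states. The sketch below is only an illustration of that special case; it uses a finite window of sites and the fact that products of $\sigma_3$ operators are diagonal in the product basis, and all names are hypothetical.

```python
def sigma3_product(config, sites):
    """Expectation of the product of sigma_3(x), x in sites, in the product state |config>."""
    value = 1
    for x in sites:
        value *= config[x]
    return value

def mixture_expectation(mixture, sites):
    """Expectation in a convex combination [(weight, config), ...] of product states."""
    return sum(weight * sigma3_product(config, sites) for weight, config in mixture)

SITES = range(-10, 11)                       # finite window of the chain (assumed size)
all_up = {x: 1 for x in SITES}
all_down = {x: -1 for x in SITES}
invariant_state = [(0.5, all_up), (0.5, all_down)]

# The order parameter itself averages out in the invariant mixture ...
print(mixture_expectation(invariant_state, [0]))
# ... but the two-point function does not decay with distance.
print([mixture_expectation(invariant_state, [0, x]) for x in (1, 5, 10)])
```

In each pure component the two-point function factorizes into the square of the one-point function, which is the finite-window counterpart of the sum $\sum_i |\omega'(\varphi_i)|^2$ appearing in the proof.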


If $G$ is compact, for any C*-algebra $A$, invariant KMS states (including ground states) can always be constructed via (9.133), provided, of course, KMS states (or ground states) exist in the first place. Fortunately, existence can be shown in the following way. Let $A$ be a quasi-local C*-algebra à la (8.130), in which:

1. $\dim(H) < \infty$ (and hence also $\dim(H_\Lambda) < \infty$ for any finite $\Lambda \subset \mathbb{Z}^d$);
2. Dynamics is defined locally on each algebra $A_\Lambda = B(H_\Lambda)$ via (9.40) and (9.41), i.e., with free boundary conditions, having a global limit $\alpha$ as in Theorem 9.15.

In that case, by Corollary 9.27 each C*-algebra $A_\Lambda$ has a unique $\beta$-KMS state $\omega_\Lambda^\beta$, given by the local Gibbs state (9.96). However, if $\Lambda^{(1)} \subset \Lambda^{(2)}$, then the restriction of the $\beta$-KMS state of $A_{\Lambda^{(2)}}$ to $A_{\Lambda^{(1)}} \subset A_{\Lambda^{(2)}}$ is not given, as naively expected, by the $\beta$-KMS state of $A_{\Lambda^{(1)}}$, because the former involves boundary terms.
Fortunately, this complication may be overcome, since at least for models with short-range forces (cf. Theorem 9.15) one may put

$$\omega^\beta(a) = \lim_{N\to\infty} \omega^\beta_{\Lambda_N}(a), \tag{10.43}$$

where $\Lambda_N$ is defined in (8.153). This limit exists for $a \in \cup_\Lambda A_\Lambda$, from which $\omega^\beta$ extends by continuity to all of $A$, on which it is a $\beta$-KMS state (cf. Theorem 10.10).

Alternatively, by the Hahn–Banach Theorem (in the form of Corollary B.41) combined with Lemma C.4 (which guarantees that any Hahn–Banach extension of a state remains a state), each local Gibbs state $\omega_\Lambda^\beta$ on $A_\Lambda \subset A$ extends, in a non-unique way, to a state $\hat\omega_\Lambda^\beta$ on $A$. This gives a net of states $(\hat\omega_\Lambda^\beta)$ on $A$ indexed by the finite subsets $\Lambda$ of $\mathbb{Z}^d$; one may also work with sequences $(\hat\omega^\beta_{\Lambda_N})$. Since $A$ has a unit, its state space $S(A)$ is a compact convex set, so the above net (or sequence) has at least one limit point, or, equivalently, has at least one convergent subnet (or subsequence), which (despite its potential lack of uniqueness in two respects, i.e. the choice of the extensions $\hat\omega_\Lambda^\beta$ and the choice of a limit point) one might write as

$$\hat\omega^\beta = \lim_\Lambda \hat\omega_\Lambda^\beta. \tag{10.44}$$

Without proof, we quote the relevant technical result (assuming 1–2 above):

Proposition 10.8. Each limit state $\hat\omega^\beta$ is a $\beta$-KMS state (i.e. for the dynamics $\alpha$).
Anticipating the existence of SSB in models, one should now feel a little uneasy: 


• It follows from Corollary 9.27 that (at fixed $\beta$) there is a unique KMS state on each local algebra $A_\Lambda$ for the given local dynamics $\alpha^{(\Lambda)}$, namely the local Gibbs state $\omega_\Lambda^\beta$ on $A_\Lambda$. If (as is the case in all our examples) the globally broken $G$-symmetry is induced by local automorphisms $\gamma_g^{(\Lambda)}: A_\Lambda \to A_\Lambda$ that commute with the local dynamics $\alpha^{(\Lambda)}$, then each local Gibbs state is $G$-invariant: this follows explicitly from $G$-invariance of the local Hamiltonian $h_\Lambda$ and the formulae (9.96) - (9.98), or, more abstractly, from the fact that if $\omega_\Lambda^\beta$ were not invariant under all $\gamma_g^{(\Lambda)}$, it would not be unique (as its translate $\omega_\Lambda^\beta \circ \gamma_g^{(\Lambda)}$ would be another KMS state).

• And yet (in case of SSB) there exist non-invariant (and hence non-unique) KMS states on $A$, which are even limits in the sense of (10.44) of the above invariant (and hence unique) local KMS states on $A_\Lambda$!

• Real samples are finite and hence are described by the local algebras $A_\Lambda$, with their unique invariant equilibrium states $\omega_\Lambda^\beta$. Yet finite samples do display SSB, e.g., ferromagnetism (broken $\mathbb{Z}_2$-symmetry), superconductivity (broken $U(1)$).

• Therefore, the theory that should describe SSB in real materials, namely the finite theory $A_\Lambda$, apparently fails to do so (as it seems to forbid SSB), whereas the idealized theory $A$, which describes strictly infinite systems and in those systems allows SSB, in fact turns out to describe key properties of finite samples.
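10.4 Spontaneous symmetry breaking for short-range forces

We continue our discussion of SSB in quantum spin systems, especially of the construction of global KMS states in the previous section, see (10.44) and preceding text. Recall that each finite system $A_\Lambda$ has a unique $\beta$-KMS state $\omega_\Lambda^\beta$, namely the local Gibbs state (9.96), but that these states are incompatible for different $\Lambda$'s, in that, if $\Lambda^{(1)} \subset \Lambda^{(2)}$, then the restriction of $\omega^\beta_{\Lambda^{(2)}}$ to $A_{\Lambda^{(1)}} \subset A_{\Lambda^{(2)}}$ is not given by $\omega^\beta_{\Lambda^{(1)}}$ because of boundary terms. To correct for this, one introduces the surface energy

$$b_{\Lambda^{(1)},\Lambda^{(2)}} = \sum_{X \subset \Lambda^{(2)}:\ X \cap \Lambda^{(1)} \neq \emptyset,\ X \cap (\Lambda^{(1)})^c \neq \emptyset} \Phi(X), \tag{10.45}$$

with ensuing interaction energy

$$b_\Lambda = \lim_{\Lambda^{(2)} \nearrow \mathbb{Z}^d} b_{\Lambda,\Lambda^{(2)}} = \sum_{X \cap \Lambda \neq \emptyset,\ X \cap \Lambda^c \neq \emptyset} \Phi(X), \tag{10.46}$$

provided this limit exists (which it does for short-range forces). Now perturb $\omega^\beta_{\Lambda^{(2)}}$ by replacing $h_{\Lambda^{(2)}}$ in (9.96) - (9.98) (with $\Lambda \rightsquigarrow \Lambda^{(2)}$) by $h_{\Lambda^{(2)}} - b_{\Lambda^{(1)},\Lambda^{(2)}}$. Denoting this modification of $\omega^\beta_{\Lambda^{(2)}}$ by $\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}}$, we obtain (10.47), which implies (10.48):

$$\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}} = \omega^\beta_{\Lambda^{(1)}} \otimes \omega^\beta_{\Lambda^{(2)} \setminus \Lambda^{(1)}}; \tag{10.47}$$

$$(\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}})|_{A_{\Lambda^{(1)}}} = \omega^\beta_{\Lambda^{(1)}}. \tag{10.48}$$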
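The role of the surface energy can be seen in a toy computation. The sketch below is a commutative (classical) analogue rather than the quantum construction itself: two Ising spins with $\Lambda^{(1)} = \{0\}$ and $\Lambda^{(2)} = \{0, 1\}$, a nearest-neighbour bond and a longitudinal field. The parameter values and function names are assumptions for illustration. It compares the naive restriction of the Gibbs state on $\Lambda^{(2)}$ with the restriction obtained after subtracting the boundary bond, as in (10.47) - (10.48).

```python
import math
from itertools import product

BETA, J, FIELD = 1.0, 1.0, 0.3   # illustrative values (assumed)

def gibbs(energy, n_sites):
    """Gibbs distribution exp(-BETA * energy) / Z on spin configurations of n_sites sites."""
    configs = list(product((-1, 1), repeat=n_sites))
    weights = [math.exp(-BETA * energy(s)) for s in configs]
    z = sum(weights)
    return {s: w / z for s, w in zip(configs, weights)}

def restrict_to_first_site(state):
    """Marginal on site 0, i.e. the restriction to the subsystem {0}."""
    marginal = {(-1,): 0.0, (1,): 0.0}
    for s, p in state.items():
        marginal[(s[0],)] += p
    return marginal

def h_small(s):
    """Hamiltonian on the small region {0}."""
    return -FIELD * s[0]

def h_large(s):
    """Hamiltonian on the large region {0, 1}."""
    return -J * s[0] * s[1] - FIELD * (s[0] + s[1])

def surface(s):
    """Surface energy, cf. (10.45): the single bond crossing the boundary of {0}."""
    return -J * s[0] * s[1]

local = gibbs(h_small, 1)
naive = restrict_to_first_site(gibbs(h_large, 2))
corrected = restrict_to_first_site(gibbs(lambda s: h_large(s) - surface(s), 2))

print("local Gibbs state:     ", local)
print("naive restriction:     ", naive)
print("corrected restriction: ", corrected)
```

With a nonzero field the naive restriction differs from the local Gibbs state, while the corrected one agrees with it; at zero field the two-site example happens to be too small to show the discrepancy, which is why a field was included.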


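If (10.46) exists, we may likewise perturb any $\alpha$-invariant state $\omega$ on $A$ to $\omega^\Lambda$, i.e.,

$$\omega^\Lambda(a) = \frac{\langle e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega,\ \pi_\omega(a)\, e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega \rangle}{\| e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega \|^2}, \tag{10.49}$$

where $\Lambda \subset \mathbb{Z}^d$ is finite, $h_\omega$ is defined as in (9.51) - (9.52), and $\Omega_\omega$ is in the domain of the unbounded operator $\exp(-\beta(h_\omega - \pi_\omega(b_\Lambda))/2)$; the reason is that $\pi_\omega(b_\Lambda)$ is bounded, whereas $\exp(-\beta h_\omega/2)\Omega_\omega = \Omega_\omega$ (since $h_\omega\Omega_\omega = 0$). For example,

$$(\omega^\beta_{\Lambda^{(2)}})^{\Lambda^{(1)}} = \omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}}, \tag{10.50}$$

where $\omega = \omega^\beta_{\Lambda^{(2)}}$ is a Gibbs state on $A = A_{\Lambda^{(2)}}$, as in Theorem 9.24 (with $\Lambda \rightsquigarrow \Lambda^{(2)}$). Indeed, using (9.114) - (9.117) and the relation $h_\omega = h_{\Lambda^{(2)}} - Jh_{\Lambda^{(2)}}J$, where the operator $J$ is defined in (9.124), we compute the numerator in (10.49) as

$$\mathrm{Tr}\Big( \big(e^{-\beta(h_{\Lambda^{(2)}} - Jh_{\Lambda^{(2)}}J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2}\big)^*\, a\, e^{-\beta(h_{\Lambda^{(2)}} - Jh_{\Lambda^{(2)}}J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2} \Big) = \mathrm{Tr}\big( e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)}\, a \big), \tag{10.51}$$

since $Jh_{\Lambda^{(2)}}J$ commutes with $h_{\Lambda^{(2)}} - b_\Lambda$. This subsequently gives

$$e^{-\beta(h_{\Lambda^{(2)}} - Jh_{\Lambda^{(2)}}J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2} = e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)/2}\, e^{\beta Jh_{\Lambda^{(2)}}J/2}\, e^{-\beta h_{\Lambda^{(2)}}/2} = e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2}\, e^{\beta h_{\Lambda^{(2)}}/2} = e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)/2}. \tag{10.52}$$

Likewise, the denominator in (10.49) equals $\mathrm{Tr}\big(\exp(-\beta(h_{\Lambda^{(2)}} - b_\Lambda))\big)$.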

Eqs. (10.50) and (10.48) suggest that if $\omega = \omega^\beta$ is a $\beta$-KMS state, then although $\omega$ itself does not localize to a Gibbs state $\omega_\Lambda^\beta$ on $A_\Lambda$, its perturbed version $\omega^\Lambda$ does. Under assumptions 1-2 stated in §10.3, i.e., in the situation of Theorem 9.15 with $\dim(H) < \infty$, this motivates the following quantum analogue of the DLR approach to classical equilibrium states, i.e., of Definition 9.23:

Definition 10.9. For fixed inverse temperature $\beta \in \mathbb{R}\setminus\{0\}$ and fixed interaction $\Phi$, a Gibbs state $\omega^\beta$ on a quasi-local algebra $A$ with dynamics given by some potential $\Phi$ is an $\alpha_t$-independent state such that for each finite region $\Lambda \subset \mathbb{Z}^d$ one has

$$(\omega^\beta)^\Lambda = \omega_\Lambda^\beta \otimes \omega_{\Lambda^c}, \tag{10.53}$$

where $\omega_\Lambda^\beta$ is the local Gibbs state (9.96) on $A_\Lambda$ and $\omega_{\Lambda^c}$ is some state on $A_{\Lambda^c}$.

Theorem 10.10. Under assumptions 1-2 in §10.3, and if in addition the subspace $D = \cup_\Lambda A_\Lambda \subset A$ is a core for the derivation (9.54) (i.e., the closure of $\delta$ defined on $D$ is $\delta$ as defined in Proposition 9.19), then Gibbs states coincide with KMS states.

The proof is rather technical and so we omit it. It follows that if $\omega^\beta \in S_\beta(A)$, then

$$((\omega^\beta)^\Lambda)|_{A_\Lambda} = \omega_\Lambda^\beta. \tag{10.54}$$


Even so, we still need to define in precisely which sense the net $((\omega^\beta)|_{A_\Lambda})_\Lambda$ converges to $\omega^\beta$ (or when perhaps even the net $(\omega_\Lambda^\beta)$ converges to $\omega^\beta$); for simplicity we take $\Lambda = \Lambda_N$ as in (8.153), and just consider sequences indexed by $N$ (rather than nets). To this end, let $(\omega_{1/N})_N$ be a sequence of states with $\omega_{1/N} \in S(A_{\Lambda_N})$. As in Definition 8.24, given some $\omega_0 \in S(A)$ (if it exists), we say that

$$\lim_{N\to\infty} \omega_{1/N} = \omega_0 \tag{10.55}$$

iff for any sequence $(a_{1/N})$ in $A$ with $a_{1/N} \in A_{\Lambda_N} \subset A$ that converges to $a \in A$ one has

$$\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_0(a). \tag{10.56}$$

For example, if we take $\omega_0 \in S(A)$ and define $\omega_{1/N} = \omega_0|_{A_{\Lambda_N}}$, then (10.55) holds by continuity of $\omega_0$ (as $\|\omega_0\| = 1$), which implies that $\lim_{N\to\infty} \omega_0(a_{1/N}) = \omega_0(a)$.

It follows from the comments preceding Definition 8.24 that the above notion (10.55) - (10.56) of convergence is the same as the one given by (8.164), so that it is similar to the convergence of states we defined for the other two classes of examples listed earlier, viz. classical mechanics (cf. §10.1) and thermodynamics.
We denote the restriction of some global KMS state $\omega^\beta$ (defined on $A$) to $A_{\Lambda_N} \subset A$ by $\omega^\beta_{|N}$, whereas as usual we write $\omega^\beta_{\Lambda_N}$ for the unique local Gibbs state on $A_{\Lambda_N}$. Keeping Definition 8.24 and Proposition 8.25 in mind, the situation is as follows:

1. Any KMS state $\omega^\beta$ equals the limit $\omega_0$ of its restrictions $\omega^\beta_{|N}$ (i.e. to $A_{\Lambda_N}$).
2. Each state $\omega^\beta_{|N}$ differs from the local Gibbs state $\omega^\beta_{\Lambda_N}$ (even if $\omega^\beta$ is unique).
3. The local Gibbs states $\omega^\beta_{\Lambda_N}$ typically converge to a KMS state $\omega_0^\beta$, cf. (10.43).
4. In models with symmetry, this global Gibbs state $\omega_0^\beta$ is invariant (like the $\omega^\beta_{\Lambda_N}$).

The first claim follows from the argument given after (10.55). The second is the contrapositive to (10.54) and has been explained in §10.3: although the states $\omega^\beta_{|N}$ and $\omega^\beta_{\Lambda_N}$ are both of local Gibbs type, the Hamiltonian of the former differs from $h_{\Lambda_N}$ by the boundary term $b_{\Lambda_N}$. The third claim cannot be proved in general, but in models with short-range forces it holds in both forms (10.43) and (10.55) - (10.56). In such models the $G$-symmetry is local, i.e., $G$ acts on each $A_\Lambda$ through unitaries

$$u_g^{(\Lambda)} = \otimes_{x\in\Lambda}\, u_g(x); \tag{10.57}$$

$$\gamma_g^{(\Lambda)}(a_\Lambda) = u_g^{(\Lambda)} a_\Lambda (u_g^{(\Lambda)})^* \qquad (a_\Lambda \in A_\Lambda,\ g \in G), \tag{10.58}$$

where $u_g(x) \in B(H_x)$, leaving each local Hamiltonian $h_\Lambda$ and hence each local Gibbs state $\omega_\Lambda^\beta$ invariant. If $a \in A$ is local, i.e., $a \in \cup_\Lambda A_\Lambda$, then

$$\gamma_g(a) = \lim_\Lambda \gamma_g^{(\Lambda)}(a), \tag{10.59}$$

followed by continuous extension to $a \in A$, so that, assuming (10.55),

$$\omega_0(\gamma_g(a)) = \lim_{N\to\infty} \omega_{1/N}(\gamma_g(a)) = \lim_{N\to\infty} \omega_{1/N}(\gamma_g^{(N)}(a)) = \lim_{N\to\infty} \omega_{1/N}(a) = \omega_0(a),$$

since $\omega_{1/N} \circ \gamma_g^{(N)} = \omega_{1/N}$ by assumption. Thus the global Gibbs state $\omega_0^\beta$ inherits the $G$-invariance of its local approximants $\omega^\beta_{\Lambda_N}$. In case of SSB, the restrictions $\omega^\beta_{|N}$ of some non-invariant extreme KMS state $\omega^\beta$ determine $\omega^\beta$, so that in principle SSB is detectable through the local states $\omega^\beta_{|N}$. It would be question-begging to construct the latter from the global states $\omega^\beta$, though, so Butterfield’s Principle (and hence in its wake Earman’s Principle) holds only if we can show how and why the states of sufficiently large yet finite systems $A_{\Lambda_N}$ tend to $\omega^\beta_{|N}$ rather than to $\omega^\beta_{\Lambda_N}$.

Unfortunately, showing any of this in specific models at finite (inverse) temperature $0 < \beta < \infty$ is pretty complicated. For example, in the quantum Ising model (9.42) in $d = 1$, KMS states are unique for any $\beta$, so that for SSB one must go to $d \geq 2$. In that case, it can be shown from Theorem 10.7 that for $B = 0$, below some critical temperature (i.e. for $\beta > \beta_c$) the $\mathbb{Z}_2$ symmetry defined in (10.68) below is broken, but this takes considerable effort and is beyond the scope of this book.
10.5 Ground state(s) of the quantum Ising chain

It is much simpler to put $\beta = \infty$ and hence turn to the ground state(s) of the quantum Ising model (9.42) in $d = 1$, which is manageable. The interesting case is $B > 0$, with $J = 1$ and free boundary conditions, so that for $\Lambda = \Lambda_N$ (with $N$ even), we have

$$h_N = -\sum_{x\in\Lambda_N} \big(\sigma_3(x)\sigma_3(x+1) + B\sigma_1(x)\big); \tag{10.60}$$

$$\Lambda_N = \{-\tfrac12 N, \ldots, \tfrac12 N - 1\}; \tag{10.61}$$

$$H_{\Lambda_N} \equiv H_N = \otimes_{x\in\Lambda_N} H_x; \tag{10.62}$$

$$H_x = \mathbb{C}^2 \qquad (x \in \Lambda_N), \tag{10.63}$$

where the operator $\sigma_i(x)$ acts as the Pauli matrix $\sigma_i$ on $H_x$ and as the unit matrix $1_2$ elsewhere. This model describes a chain of $N$ immobile spin-$\tfrac12$ particles with ferromagnetic coupling in a transverse magnetic field (it is a special case of the so-called XY-model, to which similar conclusions apply). The local Hamiltonians $h_N$ define time evolution on the local algebras

$$A_{\Lambda_N} \equiv A_N = B(H_N) \tag{10.64}$$

by (9.40), i.e.,

$$\alpha_t^{(N)}(a_N) = e^{ith_N} a_N e^{-ith_N} \qquad (a_N \in A_N), \tag{10.65}$$

which by Theorem 9.15 defines a time evolution on the quasi-local C*-algebra

$$A = \overline{\cup_{N\in\mathbb{N}} A_N}^{\|\cdot\|} = \otimes_{x\in\mathbb{Z}} B(H_x), \tag{10.66}$$

namely by regarding the unitaries $\exp(ith_N) \in A_N \subset A$ as elements of $A$ and putting

$$\alpha_t(a) = \lim_{N\to\infty} e^{ith_N} a\, e^{-ith_N} \qquad (a \in A), \tag{10.67}$$

which exists (although the sequence $(\exp(ith_N))_N$ in $A$ does not converge in $A$).
For any $B \in \mathbb{R}$, the quantum Ising chain has a $\mathbb{Z}_2$-symmetry given by a 180-degree rotation around the $x$-axis, locally implemented by the unitary operator $u(x) = \sigma_1(x)$, which at each $x \in \Lambda_N$ yields $(\sigma_1, \sigma_2, \sigma_3) \mapsto (\sigma_1, -\sigma_2, -\sigma_3)$, since $\sigma_i\sigma_j = -\sigma_j\sigma_i$ for $i \neq j$. Thus $u(x)$ sends each $\sigma_3(x)$ to $-\sigma_3(x)$ but maps each $\sigma_1(x)$ to itself. As in (10.57), this symmetry is implemented by the unitary operator

$$u^{(N)} = \otimes_{x\in\Lambda_N}\, \sigma_1(x) \tag{10.68}$$

on $H_N$, which satisfies $[h_N, u^{(N)}] = 0$, or, equivalently,

$$u^{(N)} h_N (u^{(N)})^* = h_N. \tag{10.69}$$
The ensuing $\mathbb{Z}_2$-symmetry is given by the automorphism $\gamma^{(N)}$ of $A_N$ defined by

$$\gamma^{(N)}(a) = u^{(N)} a (u^{(N)})^* \qquad (a \in A_N), \tag{10.70}$$

which induces a global automorphism $\gamma \in \mathrm{Aut}(A)$ as in (10.59), i.e.,

$$\gamma(a) = \lim_{N\to\infty} u^{(N)} a (u^{(N)})^* \qquad (a \in A), \tag{10.71}$$

which limit once again exists despite the fact that the sequence $u^{(N)}$ has no limit in $A$. Thus $\mathbb{Z}_2$-invariance of the model follows from the local property

$$\alpha_t^{(N)} \circ \gamma^{(N)} = \gamma^{(N)} \circ \alpha_t^{(N)}, \tag{10.72}$$

which in the limit $N \to \infty$ gives

$$\alpha_t \circ \gamma = \gamma \circ \alpha_t \qquad (t \in \mathbb{R}). \tag{10.73}$$

Since $\gamma^2 = \mathrm{id}_A$, we have an action of the group $\mathbb{Z}_2 = \{-1, 1\}$ on $A$, where the nontrivial element (i.e., $g = -1$) is sent to $\gamma$. By (10.72) this group acts on the set $S_\infty(A_N)$ of ground states of $A_N$ relative to the dynamics $\alpha^{(N)}$, and by (10.73) the same is true for the set $S_\infty(A)$ of ground states of the corresponding infinite system for $\alpha$ (and analogously for $\beta$-KMS states). These sets may be described as follows.


Theorem 10.11. 1. For any $N < \infty$ and $B = 0$ the ground state of the quantum Ising
model (10.60) is doubly degenerate and breaks the $\mathbb{Z}_2$ symmetry of the model.

2. For $N < \infty$ and any $B > 0$ the ground state $\omega_{1/N}^{(0)}$ is unique and hence $\mathbb{Z}_2$-invariant.

3. At $N = \infty$ with magnetic field $0 < B < 1$, the model has a doubly degenerate
translation-invariant ground state $\omega_0^{\pm}$, which again breaks the $\mathbb{Z}_2$ symmetry.

4. At $N = \infty$ and $B > 1$ the ground state is unique (and hence $\mathbb{Z}_2$-invariant).

5. Recall Definition 8.24. For $0 < B < 1$ the states $(\omega_{1/N}^{(0)})_{N \in \mathbb{N}}$ (as in no. 2) with

$\omega_0^{(0)} = \tfrac{1}{2}(\omega_0^{+} + \omega_0^{-})$ (10.74)
form a continuous field of states on the continuous bundle $A^{(q)}$; in particular,
$\lim_{N\to\infty} \omega_{1/N}^{(0)} = \omega_0^{(0)}$. (10.75)


The two ground states in no. 1 and no. 3 are tensor products of $|\uparrow\rangle$ and $|\downarrow\rangle$,
respectively (where $\sigma_3|\uparrow\rangle = |\uparrow\rangle$ and $\sigma_3|\downarrow\rangle = -|\downarrow\rangle$), so that $\sigma_3(0)$ is an order
parameter in the sense of Definition 10.6. In no. 4, on the other hand, each spin aligns with the
magnetic field in the x-direction, so that the ground state is an infinite tensor product
of states $|\rightarrow\rangle$, where $\sigma_1|\rightarrow\rangle = |\rightarrow\rangle$, and this time $\sigma_1(0)$ is an order parameter.

Case no. 2 becomes more transparent if we realize the Hilbert space $H_N$ as
$\ell^2(S_N)$, where $S_N$ is the set of all spin configurations $s$ on N sites, that is,

$s : \{-\tfrac{1}{2}N, -\tfrac{1}{2}N + 1, \ldots, \tfrac{1}{2}N - 1\} \to \{-1, 1\}$.
In terms of the eigenvectors $|1\rangle = |\uparrow\rangle$ and $|-1\rangle = |\downarrow\rangle$ of $\sigma_3$, and the orthonormal
basis $(\delta_s)_{s \in S_N}$ of $\ell^2(S_N)$ (where $\delta_s(t) = \delta_{st}$), a suitable unitary equivalence

$v_N : \ell^2(S_N) \to H_N$ (10.76)
is given by linear extension of
$v_N \delta_s = |s(-\tfrac{1}{2}N) \cdots s(\tfrac{1}{2}N - 1)\rangle$, $s, t \in S_N$. (10.77)


For example, the state $|1 \cdots 1\rangle$ corresponds to $\delta_{s_\uparrow}$, where $s_\uparrow(x) = 1$ for all x, and
analogously $s_\downarrow(x) = -1$ for the state $|-1 \cdots -1\rangle$. Using $\ell^2(S_N)$, we may talk of
localization of states in spin configuration space (similar to localization of
wave-functions in $L^2(\mathbb{R}^n)$), in the sense that some $\psi \in \ell^2(S_N)$ may be peaked on just a
few spin configurations. Provided $0 < B < 1$ this is indeed the case for the unique
ground state in case no. 2, which is similar to the ground state of the double-well
potential discussed in §§10.1-10.2, replacing $\mathbb{R}$ by $S_N$ (and $\hbar > 0$ by $1/N$).

Theorem 10.11 and related results used below, such as eq. (10.82), follow from
the exact solution of the model for both $N < \infty$ and $N = \infty$, to be discussed in
§§10.6-10.7. This solution is rather involved, but a rough picture of the various
ground states may already be obtained from a classical approximation in the spirit
of §8.1. This approximation assumes that the spin-1/2 operators $\tfrac{1}{2}\sigma_i$ are replaced
by their counterparts for spin $n \cdot \tfrac{1}{2}$, upon which one takes the limit $n \to \infty$. In this
limit, the spin operators are turned into the corresponding coordinate functions on
the coadjoint orbit $O_{1/2} \subset \mathbb{R}^3$ for SU(2), which is the two-sphere $S^2_{1/2}$ with radius
r = 1/2. In principle, this should be done for each of the N spins separately, yielding
a classical Hamiltonian $h$ that is a function on the N-fold cartesian product of $S^2_{1/2}$
with itself. However, if we a priori assume translation invariance of the classical
ground state, only one such copy remains. Using spherical coordinates


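$(x = \tfrac{1}{2}\sin\theta\cos\varphi,\ y = \tfrac{1}{2}\sin\theta\sin\varphi,\ z = \tfrac{1}{2}\cos\theta)$, (10.78)
the ensuing trial Hamiltonian becomes just a function on $O_{1/2}$, given by
$h(\theta, \varphi) = -(\tfrac{1}{2}\cos^2\theta + B\sin\theta\cos\varphi)$. (10.79)
Minimizing gives $\cos\varphi = 1$ and hence $\varphi = 0$ for any B, upon which
$h(\theta) = -(\tfrac{1}{2}\cos^2\theta + B\sin\theta)$ (10.80)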


yields the phase portrait of Theorem 10.11 for $N = \infty$, as follows. For $0 < B < 1$,
the global minimum is reached at the two different solutions $\theta_\pm$ of $\sin\theta_\pm = B$
(with $\cos\theta_\pm = \pm\sqrt{1 - B^2}$), with ensuing spin vectors

$x_\pm(B) = (\tfrac{1}{2}B, 0, \pm\tfrac{1}{2}\sqrt{1 - B^2})$, (10.81)

starting at $x_\pm(0) = (0, 0, \pm\tfrac{1}{2})$ and merging at $B = 1$ to $x_+(1) = x_-(1) = (\tfrac{1}{2}, 0, 0)$.
This remains the unique ground state for $B > 1$, where all spins align with the field.
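The classical phase portrait can be reproduced by a direct numerical minimization. The sketch below assumes the trial energy $h(\theta) = -(\tfrac{1}{2}\cos^2\theta + B\sin\theta)$ on $[0, \pi]$ and uses a plain grid search on each half of the interval; the function names are illustrative only.

```python
import numpy as np

def h_classical(theta, B):
    """Trial energy h(theta) = -(1/2 cos^2(theta) + B sin(theta))."""
    return -(0.5 * np.cos(theta) ** 2 + B * np.sin(theta))

def minimizers(B, num=200001):
    """Grid-search minimizers on [0, pi/2] and on [pi/2, pi] (theta_+ and theta_-)."""
    theta = np.linspace(0.0, np.pi, num)
    half = num // 2
    lower, upper = theta[: half + 1], theta[half:]
    t_plus = lower[np.argmin(h_classical(lower, B))]
    t_minus = upper[np.argmin(h_classical(upper, B))]
    return t_plus, t_minus

def spin_vector(theta):
    """Point on the sphere of radius 1/2 in spherical coordinates with phi = 0."""
    return np.array([0.5 * np.sin(theta), 0.0, 0.5 * np.cos(theta)])

t_plus, t_minus = minimizers(0.6)      # symmetry-broken regime 0 < B < 1
t_hi_plus, t_hi_minus = minimizers(1.5)  # B > 1: both searches end at theta = pi/2
```

For $B = 0.6$ the two minimizers give spin vectors close to $(\tfrac{1}{2}B, 0, \pm\tfrac{1}{2}\sqrt{1-B^2})$, whereas for $B = 1.5$ both searches return $\theta = \pi/2$, i.e. a single minimum aligned with the field.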
In the regime $0 < B < 1$ with large but finite N, one finds a far-reaching analogy
between the double-well potential and the quantum Ising chain, namely:


• The ground state of (10.60) is doubly peaked in spin configuration space, similar
to its counterpart for the double-well potential in real configuration space.

• One has convergence to localized ground states: (10.15) - (10.16) for the double
well and (10.74) - (10.75) for the quantum Ising chain.

• For the energy difference $\Delta_N = E_N^{(1)} - E_N^{(0)}$ between the first excited state and the
ground state one has (10.17) - (10.18) for the double well, and

$\Delta_N \approx (1 - B^2) B^N$ $(N \to \infty)$, (10.82)

for the quantum Ising chain. Thus both models show exponential decay, i.e. of
(10.82) in N as $N \to \infty$, and of (10.13) in $1/\hbar$ as $\hbar \to 0$.
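The exponential closing of the gap can be observed by brute-force diagonalization of short chains. The sketch below assumes free boundary conditions and the Hamiltonian $-\sum_x(\sigma_3(x)\sigma_3(x+1) + B\sigma_1(x))$; it only inspects ratios of gaps, so that an overall normalization of the Hamiltonian does not matter. The helper names are hypothetical.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def site_op(op, x, n):
    """Embed a single-site operator at site x (0-based) of an n-site chain."""
    factors = [I2] * n
    factors[x] = op
    return reduce(np.kron, factors)

def gap(n, B):
    """Difference between the two lowest eigenvalues of the open chain of length n."""
    h = np.zeros((2**n, 2**n))
    for x in range(n - 1):
        h -= site_op(S3, x, n) @ site_op(S3, x + 1, n)
    for x in range(n):
        h -= B * site_op(S1, x, n)
    energies = np.linalg.eigvalsh(h)   # ascending order
    return energies[1] - energies[0]

B = 0.5
gaps = {n: gap(n, B) for n in (4, 6, 8)}
# If the gap behaves like const * B^N, adding two sites multiplies it by about B^2.
ratios = [gaps[6] / gaps[4], gaps[8] / gaps[6]]
```

With $B = 0.5$ both entries of `ratios` should be close to $B^2 = 0.25$, in line with the factor $B^N$ in (10.82).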


It should be mentioned that exponential decay of the energy gap seems a
low-dimensional luxury, which is not really needed for SSB. All that counts is that
$\lim_{N\to\infty} \Delta_N = 0$, which guarantees that the first excited state is asymptotically
degenerate with the ground state, so that appropriate linear combinations like $\omega_{1/N}^{\pm}$ can be
formed that converge to the degenerate symmetry-breaking pure (and hence physical)
ground states (or extreme and hence physical KMS states) of the limit system,
which are localized and stable (as is clear from the double well). The fact that in
the two models at hand only one excited state participates in this mechanism is due
to the simple $\mathbb{Z}_2$ symmetry that is being broken; SSB of continuous symmetries
requires a large number of low-lying states that are asymptotically degenerate with the
ground state and hence also with each other (one speaks of a thin energy spectrum).

The existence of low-lying excited states may be proved abstractly (i.e., in a
model-independent way), as follows. For $N < \infty$, let $\psi_N^{(0)}$ be the ground state
(assumed unique) of some model defined on $\Lambda_N \subset \mathbb{Z}^d$, and let $\phi$ be an order parameter
(cf. Theorem 10.7) with accompanying vector field $\Phi_N = \sum_{x \in \Lambda_N} \phi(x)$; in the
quantum Ising chain, we take $\phi = \sigma_3$. Then the key assumptions are expressed by


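$\langle \psi_N^{(0)}, \Phi_N \psi_N^{(0)} \rangle = 0$; (10.83)
$\langle \psi_N^{(0)}, \Phi_N^2 \psi_N^{(0)} \rangle \geq C_1 \cdot N^2$ $(N \to \infty,\ C_1 > 0)$; (10.84)
$\| [[\Phi_N, h_N], \Phi_N] \| \leq C_2 \cdot N$ $(N \to \infty,\ C_2 > 0)$. (10.85)

The first states that the ground state is symmetric, the second enforces long-range
order, as in (10.41), and the third follows from having short-range forces. A simple
computation then shows that the unit vector $\psi_N^{(1)} = \Phi_N \psi_N^{(0)} / \|\Phi_N \psi_N^{(0)}\|$ satisfies

$\langle \psi_N^{(1)}, h_N \psi_N^{(1)} \rangle - \langle \psi_N^{(0)}, h_N \psi_N^{(0)} \rangle \leq C_2/(C_1 N)$ $(N \to \infty)$. (10.86)
Since $\psi_N^{(1)}$ is orthogonal to $\psi_N^{(0)}$ by (10.83), the variational principle for eigenvalues
(note that $h_N$ has discrete spectrum, as $\dim(H_{\Lambda_N}) < \infty$) then gives $\Delta_N \leq C_2/(C_1 N)$,
so that $\Delta_N$ vanishes as $N \to \infty$, though perhaps not as quickly as (10.82) indicates.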
10.6 Exact solution of the quantum Ising chain: N < ∞


The solution of the quantum Ising chain is based on a transformation to fermionic
variables. Let H be a Hilbert space and let $F_-(H)$ be its fermionic Fock space, i.e.,

$F_-(H) = \bigoplus_{k=0}^{\infty} H_-^k$, (10.87)

where $H^0 = \mathbb{C}$, and for $k > 0$ the Hilbert space $H_-^k = e_-^{(k)} H^k$ is the totally
antisymmetrized k-fold tensor product of H with itself, see also §7.7. Here the projection
$e_-^{(k)} : H^k \to H^k$ is defined by linear extension of

$e_-^{(k)}(f_1 \otimes \cdots \otimes f_k) = \frac{1}{k!} \sum_{p \in \mathfrak{S}_k} \mathrm{sgn}(p)\, f_{p(1)} \otimes \cdots \otimes f_{p(k)}$, (10.88)

where $\mathfrak{S}_k$ is the permutation group on k objects, and $\mathrm{sgn}(p)$ is +1/−1 if p is
an even/odd permutation. With (total) Fock space $F(H) = \bigoplus_{k=0}^{\infty} H^k$ we have
$F_-(H) = e_- F(H)$, where $e_- = \sum_k e_-^{(k)}$ (strongly) is a projection. For $f \in H$ we define
the (unbounded) annihilation operator $a(f)$ on $F(H)$ by (finite) linear extension of

$a(f) f_1 \otimes \cdots \otimes f_k = \sqrt{k}\, \langle f, f_1 \rangle_H\, f_2 \otimes \cdots \otimes f_k$ (10.89)
for $k > 0$, with $a(f)z = 0$ on $H^0 = \mathbb{C}$. This gives the adjoint $a(f)^* \equiv a^*(f)$ as
$a^*(f) f_1 \otimes \cdots \otimes f_k = \sqrt{k+1}\, f \otimes f_1 \otimes \cdots \otimes f_k$. (10.90)
For each $f \in H$, we then define the following operators on $F_-(H)$:
$c(f) = e_- a(f) e_-$; (10.91)
$c^*(f) = e_- a^*(f) e_-$. (10.92)

Note that the map $f \mapsto c(f)$ is antilinear in f, whereas $f \mapsto a^*(f)$ is linear in f. It
follows that $c^*(f) = c(f)^*$, that each operator $c(f)$ and $c^*(f)$ on $F_-(H)$ is bounded
with $\|c(f)\| = \|c^*(f)\| = \|f\|$, and the canonical anticommutation relations hold:
$[c(f), c^*(g)]_+ = \langle f, g \rangle_H \cdot 1_{F_-(H)}$; (10.93)
$[c(f), c(g)]_+ = [c^*(f), c^*(g)]_+ = 0$. (10.94)

Thus we may define CAR(H) as the C*-algebra within $B(F_-(H))$ generated by all
$c(f)$, where $f \in H$. This is called the C*-algebra of canonical anticommutation
relations over H, which we have constructed in its defining representation on $F_-(H)$.
Choosing an orthonormal basis $(e_i)$ of H and writing $c(e_i) = c_i$ etc. clearly yields

$[c_i, c_j^*]_+ = \delta_{ij} \cdot 1_{F_-(H)}$; (10.95)
$[c_i, c_j]_+ = [c_i^*, c_j^*]_+ = 0$. (10.96)
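If $\dim(H) = N < \infty$, then $\mathrm{CAR}(H) = B(F_-(H))$. First, a dimension count yields
$F_-(\mathbb{C}^N) = \bigoplus_{k=0}^{N} H_-^k \cong \mathbb{C}^{2^N} \cong \bigotimes^N \mathbb{C}^2$. (10.97)

By Theorem C.90, the C*-algebra CAR(H) acts irreducibly on $F_-(H)$, so that
$\mathrm{CAR}(\mathbb{C}^N) \cong M_{2^N}(\mathbb{C})$. (10.98)

This is already nontrivial for N = 1. In that case, $F_-(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C} = \mathbb{C}^2$, and

$c = \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$; (10.99)

$c^* = \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, (10.100)

where $\sigma_\pm = \tfrac{1}{2}(\sigma_1 \pm i\sigma_2)$. This realization explicitly shows that

$\mathrm{CAR}(\mathbb{C}) \cong M_2(\mathbb{C})$. (10.101)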


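To generalize this to $N > 1$, we introduce a lattice (or chain) $\underline{N} = \{1, \ldots, N\}$, and
for each $x \in \underline{N}$ we define operators $c_x, c_x^*$ by the Jordan–Wigner transformation

$c_x = e^{\pi i \sum_{y=1}^{x-1} \sigma_+(y)\sigma_-(y)}\, \sigma_-(x) = \Big(\prod_{y=1}^{x-1} (-\sigma_3(y))\Big) \cdot \sigma_-(x)$; (10.102)
$c_x^* = e^{-\pi i \sum_{y=1}^{x-1} \sigma_+(y)\sigma_-(y)}\, \sigma_+(x) = \Big(\prod_{y=1}^{x-1} (-\sigma_3(y))\Big) \cdot \sigma_+(x)$, (10.103)

where $x > 1$, and $c_1 = \sigma_-(1)$ and $c_1^* = \sigma_+(1)$ (here $\sigma_\pm(x) = \tfrac{1}{2}(\sigma_1(x) \pm i\sigma_2(x))$ etc.).
These operators satisfy (10.95) - (10.96); the second expression on each line follows
because the operators $\sigma_+(y)\sigma_-(y)$ commute for different sites y, and

$e^{\pi i \sigma_+ \sigma_-} = -\sigma_3$. (10.104)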
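Furthermore, since
$c_x^* c_x = \sigma_+(x)\sigma_-(x) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}(x)$; (10.105)
$c_x c_x^* = \sigma_-(x)\sigma_+(x) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}(x)$, (10.106)

the inverse of the Jordan–Wigner transformation is given by

$\sigma_-(x) = e^{-\pi i \sum_{y=1}^{x-1} c_y^* c_y}\, c_x$; (10.107)
$\sigma_+(x) = c_x^*\, e^{\pi i \sum_{y=1}^{x-1} c_y^* c_y}$. (10.108)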
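The Jordan–Wigner construction is easy to test on a computer. The following sketch builds the operators $c_x$ as explicit $2^N \times 2^N$ matrices from the product form $(\prod_{y<x}(-\sigma_3(y)))\,\sigma_-(x)$ and exposes an anticommutator helper; the function names `jordan_wigner` and `anticomm` are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
S3 = np.diag([1.0, -1.0])                  # sigma_3
SM = np.array([[0.0, 0.0], [1.0, 0.0]])    # sigma_- as in (10.99)

def jordan_wigner(n):
    """Return [c_1, ..., c_n] with c_x = (prod_{y<x} -sigma_3(y)) sigma_-(x)."""
    ops = []
    for x in range(n):
        factors = [-S3] * x + [SM] + [I2] * (n - x - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticomm(a, b):
    """Anticommutator [a, b]_+ = ab + ba."""
    return a @ b + b @ a

c = jordan_wigner(3)
# All matrices are real, so the adjoint c_x^* is simply the transpose.
number_op = c[1].T @ c[1]                  # c_2^* c_2, a projection
```

Checking `anticomm(c[i], c[j].T)` against $\delta_{ij}$ times the unit matrix, and `anticomm(c[i], c[j])` against zero, reproduces (10.95) - (10.96) for N = 3.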
We return to the quantum Ising model (10.60) with free boundary conditions,
where we relabel the sites as {1,...,N}, as above, and change to the Hamiltonian

$h'_N = -\Big( \sum_{x=1}^{N-1} \sigma_1(x)\sigma_1(x+1) + \lambda \sum_{x=1}^{N} \sigma_3(x) \Big)$, (10.109)

where, in order to avoid notational confusion with the operator B in (10.111) below,
we henceforth replace $B \rightsquigarrow \lambda$. In terms of the unitary operator $u = \sqrt{1/2}\,(1_2 + i\sigma_2)$
on $\mathbb{C}^2$ and hence $u^{(N)} = \otimes_{x=1}^{N} u(x)$ on $\otimes^N \mathbb{C}^2$, we have $u^{(N)} h_N (u^{(N)})^* = h'_N$.

Using (10.102) - (10.103), up to an additive constant $\lambda N \cdot 1_N$ we omit, we find

$h_N = -\sum_{x=1}^{N} \Big( \lambda c_x^* c_x + \tfrac{1}{2}(c_x^* - c_x)(c_{x+1}^* + c_{x+1}) \Big)$, (10.110)

so we now show how to diagonalize quadratic fermionic Hamiltonians of the type

$h_N = -\sum_{x,y=1}^{N} \Big( A_{xy} c_x^* c_y + \tfrac{1}{2} B_{xy} (c_x^* c_y^* - c_x c_y) \Big)$, (10.111)

where A and B are real N × N matrices, with $A^* = A$ and $B^* = -B$. Indeed, taking

$A = \tfrac{1}{2}(S + S^*) + \lambda \cdot 1_N$; (10.112)
$B = \tfrac{1}{2}(S - S^*)$, (10.113)
recovers (10.110), where $S : \mathbb{C}^N \to \mathbb{C}^N$ is the shift operator, defined by
$Sf(x) = f(x+1)$; (10.114)
$S^*f(x) = f(x-1)$. (10.115)
By convention, $f(N+1) = f(0) = 0$ (i.e., $Sf(N) = S^*f(1) = 0$ for any $f \in \mathbb{C}^N$);
in terms of the standard basis $(\upsilon_x)$ of $\mathbb{C}^N$ we have $S\upsilon_1 = 0$ and $S\upsilon_x = \upsilon_{x-1}$ for
$x \in \{2, \ldots, N\}$, and likewise $S^*\upsilon_N = 0$ and $S^*\upsilon_x = \upsilon_{x+1}$ for $x \in \{1, \ldots, N-1\}$.
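The smart thing to do now turns out to be diagonalizing the 2N × 2N-matrix

$M = \begin{pmatrix} A & B \\ -B & -A \end{pmatrix}$, (10.116)

which by a unitary transformation may be brought into the simpler form

$M' = \tfrac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} A & B \\ -B & -A \end{pmatrix} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}$, (10.117)

where $C = A + B$. For example, for the model (10.111) we simply have

$C = S + \lambda \cdot 1_N$. (10.118)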
The equations for the eigenvalues $\epsilon_k$ and eigenvectors of $M'$, i.e.,

$M' \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix} = \epsilon_k \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix}$, (10.119)

where $\varphi_k, \psi_k \in \mathbb{C}^N$, are equivalent to both the coupled system of equations

$C \psi_k = \epsilon_k \varphi_k$; (10.120)
$C^* \varphi_k = \epsilon_k \psi_k$; (10.121)
$C = A + B$, (10.122)

where the eigenvalues $\epsilon_k$ are real (since $M^* = M$), and to the uncoupled version

$C C^* \varphi_k = \epsilon_k^2 \varphi_k$; (10.123)
$C^* C \psi_k = \epsilon_k^2 \psi_k$; (10.124)
$C C^* = A^2 - B^2 - [A, B]$; (10.125)
$C^* C = A^2 - B^2 + [A, B]$. (10.126)


Without loss of generality we may (and will) assume that the $\varphi_k, \psi_k$ are unit vectors
in $\mathbb{C}^N$, so that the corresponding unit vector in $\mathbb{C}^{2N}$ is $(\varphi_k, \psi_k)/\sqrt{2}$. Furthermore,
since C (or M) is a matrix with real entries and the $\epsilon_k$ are real, by a suitable choice
of phase we may (and will) also arrange that $\varphi_k, \psi_k$ have real components. Finally,
it follows from (10.120) - (10.121) that $(-\varphi_k, \psi_k)$ is an eigenvector of $M'$ with
eigenvalue $-\epsilon_k$, so that the unitary transformation $U'$ that diagonalizes $M'$, i.e.,

$(U')^{-1} M' U' = \begin{pmatrix} E & 0 \\ 0 & -E \end{pmatrix}$, (10.127)

where $E = \mathrm{diag}(\epsilon_1, \ldots, \epsilon_N)$, takes the form

$U' = \frac{1}{\sqrt{2}} \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix}$, (10.128)
where $\varphi$ is the N × N matrix $(\varphi_1, \ldots, \varphi_N)$, seeing each vector $\varphi_k$ as a column, etc.


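Combined with (10.117), we obtain

$U^{-1} M U = \begin{pmatrix} -E & 0 \\ 0 & E \end{pmatrix}$; (10.129)

$U = \tfrac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix} = \tfrac{1}{2} \begin{pmatrix} \psi + \varphi & \psi - \varphi \\ \psi - \varphi & \psi + \varphi \end{pmatrix} = \begin{pmatrix} u & v \\ v & u \end{pmatrix}$, (10.130)

where we introduced N × N matrices

$u = \tfrac{1}{2}(\psi + \varphi)$; (10.131)
$v = \tfrac{1}{2}(\psi - \varphi)$. (10.132)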
Using orthonormality and completeness of both the $(\varphi_k)$ and the $(\psi_k)$, one obtains

$u^* u + v^* v = 1_N$; (10.133)
$u^* v + v^* u = 0$; (10.134)
$u u^* + v v^* = 1_N$; (10.135)
$u v^* + v u^* = 0$. (10.136)

Of course, u and v are far from unique, as they depend on both the ordering and
the phases of the vectors $\varphi_k$ and $\psi_k$. In partial remedy of the former ambiguity we
assume that $0 < \epsilon_1 < \epsilon_2 < \cdots < \epsilon_N$ (which can be arranged by a suitable ordering as
well as choice of sign of the eigenvectors $\varphi_k$). Towards the latter, we already agreed
that both the $\varphi_k$ and $\psi_k$ are real, so that also our matrices u and v have real entries.
We now explain the purpose of diagonalizing M in (10.116) using u and v.
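The linear algebra of this diagonalization can be checked with a singular value decomposition, since (10.120) - (10.121) state precisely that the $\epsilon_k$ are the singular values of C with singular vectors $\varphi_k, \psi_k$. The sketch below assumes the conventions $M = \begin{pmatrix} A & B \\ -B & -A\end{pmatrix}$, $u = \tfrac12(\psi+\varphi)$, $v = \tfrac12(\psi-\varphi)$, and only tests statements that do not depend on how the $\pm E$ blocks are ordered; function names are illustrative.

```python
import numpy as np

def ising_matrices(n, lam):
    """A and B of (10.112)-(10.113) for the open chain of length n."""
    S = np.diag(np.ones(n - 1), k=1)          # (S f)(x) = f(x+1), with f(N+1) = 0
    A = 0.5 * (S + S.T) + lam * np.eye(n)
    B = 0.5 * (S - S.T)
    return A, B

def bogoliubov_blocks(A, B):
    """Singular value decomposition C = phi diag(eps) psi^T and the blocks u, v."""
    C = A + B
    phi, eps, psi_t = np.linalg.svd(C)
    psi = psi_t.T
    u = 0.5 * (psi + phi)
    v = 0.5 * (psi - phi)
    return eps, phi, psi, u, v

n, lam = 6, 0.5
A, B = ising_matrices(n, lam)
eps, phi, psi, u, v = bogoliubov_blocks(A, B)
M = np.block([[A, B], [-B, -A]])
spectrum_M = np.linalg.eigvalsh(M)            # expected: the values +eps_k and -eps_k
```

Note that `numpy.linalg.svd` returns the singular values in descending order, whereas the text orders the $\epsilon_k$ increasingly; this only permutes the columns of u and v.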


Proposition 10.12. Let u and v be operators on a Hilbert space H, where u is linear
and v is anti-linear. Let $c(f)$ and $c^*(f)$ be the operators (10.91) - (10.92), satisfying
the CAR (10.93) - (10.94). Define the Bogoliubov transformation

$\eta(f) = c(uf) + c^*(vf)$; (10.137)
$\eta^*(f) = c^*(uf) + c(vf)$, (10.138)
which extends to a linear map $\alpha : \mathrm{CAR}(H) \to \mathrm{CAR}(H)$, where $\eta(f) = \alpha(c(f))$ etc.

Then $\alpha$ is a homomorphism of C*-algebras, or, equivalently, one has the CAR

$[\eta(f), \eta^*(g)]_+ = \langle f, g \rangle_H \cdot 1_{F_-(H)}$; (10.139)

$[\eta(f), \eta(g)]_+ = [\eta^*(f), \eta^*(g)]_+ = 0$, (10.140)

iff u and v satisfy (10.133) - (10.134), with $u \rightsquigarrow u$, $v \rightsquigarrow v$. Moreover, $\alpha$ is invertible
(and hence defines an automorphism of CAR(H)) iff in addition (10.135) - (10.136)
are valid (again with $u \rightsquigarrow u$, $v \rightsquigarrow v$), in which case the inverse is
$c(f) = \eta(u^* f) + \eta^*(v^* f)$; (10.141)
$c^*(f) = \eta^*(u^* f) + \eta(v^* f)$. (10.142)


Note that anti-linearity of v is needed to make $f \mapsto \eta(f)$ anti-linear, like $f \mapsto c(f)$.
With respect to a base $(e_i)$ of H, the transformations (10.137) - (10.142) read

$\eta_i = \sum_j (u_{ji} c_j + v_{ji} c_j^*)$; (10.143)

$\eta_i^* = \sum_j (u_{ji} c_j^* + v_{ji} c_j)$; (10.144)

$c_i = \sum_j (u_{ij} \eta_j + v_{ij} \eta_j^*)$; (10.145)

$c_i^* = \sum_j (u_{ij} \eta_j^* + v_{ij} \eta_j)$. (10.146)
Proof. The proof is a straightforward computation.

In comparison with the preceding diagonalization process, where $H = \mathbb{C}^N$, we
notice that in this process u and v were both linear, whereas in Proposition 10.12 u
is linear whereas v is antilinear. This difference is easily overcome by taking $u = u$
and $v = Jv$, where $J : \mathbb{C}^N \to \mathbb{C}^N$ is the anti-linear map $Jf(x) = \overline{f(x)}$, so that J is a
conjugation in being an anti-linear map that satisfies $J^* = J^{-1} = J$.

Returning to our generic Hamiltonian (10.111), a straightforward computation
using (10.145) - (10.146), (10.116), (10.129), and (10.133) - (10.136) yields

$h_N = \sum_{k=1}^{N} \epsilon_k \eta_k^* \eta_k$, (10.147)

up to a (computable) constant, where we recall that $\epsilon_k > 0$ (k = 1,...,N). Note that
$h_N$ is still defined on the fermionic Fock space $F_-(\mathbb{C}^N)$, as $h_N$ is a (complicated)
quadratic expression in the operators $c_i$ and $c_i^*$ on $F_-(\mathbb{C}^N)$. The point is that (as a
consequence of Proposition 10.12) the $\eta_k$ and $\eta_k^*$ also satisfy the CAR, i.e.,

$[\eta_i, \eta_j^*]_+ = \delta_{ij} \cdot 1_{F_-(\mathbb{C}^N)}$; (10.148)
$[\eta_i, \eta_j]_+ = [\eta_i^*, \eta_j^*]_+ = 0$. (10.149)
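A consequence of the diagonal form (10.147) that is easy to test is that, up to an additive constant, the spectrum of the quadratic Hamiltonian (10.111) consists of all sums of subsets of $\{\epsilon_1, \ldots, \epsilon_N\}$. The sketch below checks this for the Ising matrices (10.112) - (10.113) on a chain of four sites, using Jordan–Wigner matrices for the $c_x$ and the singular values of $C = A + B$ for the $\epsilon_k$; all function names are illustrative.

```python
import numpy as np
from functools import reduce
from itertools import combinations

I2 = np.eye(2)
S3 = np.diag([1.0, -1.0])
SM = np.array([[0.0, 0.0], [1.0, 0.0]])    # sigma_-

def jordan_wigner(n):
    """Real matrices c_1, ..., c_n satisfying the CAR; c_x^* is the transpose."""
    return [reduce(np.kron, [-S3] * x + [SM] + [I2] * (n - x - 1)) for x in range(n)]

def quadratic_hamiltonian(A, B):
    """-sum_{x,y} (A_xy c_x^* c_y + 1/2 B_xy (c_x^* c_y^* - c_x c_y)), cf. (10.111)."""
    n = A.shape[0]
    c = jordan_wigner(n)
    h = np.zeros((2**n, 2**n))
    for x in range(n):
        for y in range(n):
            h -= A[x, y] * (c[x].T @ c[y])
            h -= 0.5 * B[x, y] * (c[x].T @ c[y].T - c[x] @ c[y])
    return h

def subset_sums(eps):
    """Sorted list of sums over all subsets of the single-particle energies."""
    return np.sort([sum(s) for k in range(len(eps) + 1) for s in combinations(eps, k)])

n, lam = 4, 0.5
S = np.diag(np.ones(n - 1), k=1)
A = 0.5 * (S + S.T) + lam * np.eye(n)
B = 0.5 * (S - S.T)
eps = np.linalg.svd(A + B, compute_uv=False)
levels = np.linalg.eigvalsh(quadratic_hamiltonian(A, B))
```

Comparing `levels - levels[0]` with `subset_sums(eps)` confirms the free-fermion structure of the spectrum; in particular the lowest excitation energy is the smallest $\epsilon_k$.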


Theorem 10.13. Let $A = \mathrm{CAR}(\mathbb{C}^N)$ be the CAR-algebra over $H = \mathbb{C}^N$ with
dynamics $\alpha_t(a) = e^{ith_N} a e^{-ith_N}$ given by (10.111) and hence by (10.147). Then $\alpha$ has a
unique (and hence pure and symmetric) ground state $\omega_0$, specified by the property

$\pi_{\omega_0}(\eta(f))\, \Omega_{\omega_0} = 0$ $(f \in \mathbb{C}^N)$. (10.150)


Proof. Recall that $\alpha$ defines a derivation $\delta : \mathrm{CAR}(\mathbb{C}^N) \to \mathrm{CAR}(\mathbb{C}^N)$ defined by
(9.54), which in the case at hand is simply given by $\delta(a) = i[h_N, a]$ (since A is
finite-dimensional, $\delta$ is bounded and hence defined everywhere). Using the identity

$[ab, c] = a[b, c]_+ - [c, a]_+ b$, (10.151)
as well as the relations (10.148) - (10.149), we obtain $\delta(\eta_k) = -i\epsilon_k \eta_k$, and hence

$-i\omega_0(\eta_k^* \delta(\eta_k)) = -\epsilon_k \omega_0(\eta_k^* \eta_k)$. (10.152)

The condition $-i\omega_0(a^* \delta(a)) \geq 0$, i.e., eq. (9.56) from Proposition 9.20,
therefore implies that $\omega_0(\eta_k^* \eta_k) \leq 0$, and hence $\omega_0(\eta_k^* \eta_k) = 0$ by positivity of $\omega_0$. Since
$F_-(H)$ is finite-dimensional and $A = B(F_-(H))$, cf. (10.98), we may assume ground
state(s) to be pure and normal, i.e., there is some unit vector $\psi_0 \in F_-(H)$ with
$\omega_0(a) = \langle \psi_0, a\psi_0 \rangle$ for each $a \in A$. Hence $\langle \psi_0, \eta_k^* \eta_k \psi_0 \rangle = 0$, which enforces

$\eta_k \psi_0 = 0$ (k = 1,...,N). (10.153)


This property makes $\psi_0$ unique up to a phase. Indeed, together with (10.148) -
(10.149), eq. (10.153) implies the values of all one- and two-point functions, i.e.,
$\omega_0(\eta(f)) = \omega_0(\eta^*(f)) = 0$; (10.154)
$\omega_0(\eta^*(f)\eta(g)) = \omega_0(\eta^*(f)\eta^*(g)) = \omega_0(\eta(f)\eta(g)) = 0$; (10.155)
$\omega_0(\eta(f)\eta^*(g)) = \langle f, g \rangle_H$. (10.156)

Furthermore, the value of $\omega_0$ on any product of an odd number of $\eta(f)$ and $\eta^*(g)$
vanishes; for an even number the value $\omega_0(\prod_{i=1}^{n} \eta(f_i) \prod_{j=1}^{n} \eta^*(g_j))$ is given by

$\sum_{p=1}^{n} (-1)^{n-p}\, \omega_0(\eta(f_1)\eta^*(g_p))\, \omega_0\Big(\prod_{i=2}^{n} \eta(f_i) \prod_{j=1, j \neq p}^{n} \eta^*(g_j)\Big)$.

Hence (10.153) gives $\omega_0$ on all of $\mathrm{CAR}(\mathbb{C}^N)$. Since $\mathrm{CAR}(H) = B(F_-(H))$, this fixes
$\psi_0$ up to a phase. Eq. (10.150) is just a fancy way of rewriting (10.153).


By construction, the ground state energy of (10.147) is zero. In connection with
our approach to SSB via Butterfield’s Principle it is of interest to compute the energy
$\epsilon_1$ of the first excited state. This may be done from (10.120) - (10.121) with (10.122)
and the specific expression (10.118) for the quantum Ising chain. Thus we solve

$\lambda\psi_k(x) + \psi_k(x+1) = \epsilon_k \varphi_k(x)$ $(x = 1, \ldots, N,\ \psi_k(N+1) = 0)$; (10.157)
$\lambda\varphi_k(x) + \varphi_k(x-1) = \epsilon_k \psi_k(x)$ $(x = 1, \ldots, N,\ \varphi_k(0) = 0)$. (10.158)

A solution of this system (with real wave-functions and positive energy) is given by

$\varphi_k(x) = C(-1)^k \sin(q_k(x - N - 1))$; (10.159)
$\psi_k(x) = -C \sin(q_k x)$; (10.160)
$\epsilon_k = \sqrt{1 + \lambda^2 + 2\lambda\cos(q_k)}$, (10.161)

where $C > 0$ is a normalization constant, and $q_k$ should be solved from

$(N+1) q_k = (k-1)\pi + \arctan\Big( \frac{\sin q_k}{\lambda + \cos q_k} \Big)$. (10.162)

For example, for $\lambda = 0$ (i.e. no transverse magnetic field) we obtain $q_k = k\pi/N$,
where k = 1,...,N. For $\lambda > 1$ there is a unique real solution $q_k$ for each k, too, and
even as $N \to \infty$ there is an energy gap $\epsilon_k > 0$ for each k. For $0 < \lambda < 1$, however,
there is a complex solution $q_1 = \pi + ip$, where $p \in \mathbb{R}$ is a solution to

$\tanh((N+1)p) = \frac{\sinh p}{\cosh p - \lambda}$. (10.163)

As $N \to \infty$, we find $p \approx -\ln(\lambda) - (1 - \lambda^2)\lambda^{2N}$. Eq. (10.161) then gives
$\epsilon(q_1) \approx (1 - \lambda^2)\lambda^N$ $(N \to \infty)$, (10.164)

which, recalling that $E_N^{(1)} = \epsilon_1$ and $E_N^{(0)} = 0$ and hence $\Delta_N = \epsilon_1$, confirms (10.82).
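The asymptotic formula (10.164) can be compared with a direct numerical computation, because by (10.120) - (10.121) the energies $\epsilon_k$ are the singular values of $C = S + \lambda \cdot 1_N$. The sketch below, with illustrative function names, computes the smallest singular value and the prediction $(1-\lambda^2)\lambda^N$.

```python
import numpy as np

def smallest_energy(n, lam):
    """Smallest singular value of C = S + lam * 1_N for the open chain."""
    C = np.diag(np.ones(n - 1), k=1) + lam * np.eye(n)
    return np.linalg.svd(C, compute_uv=False).min()

def asymptotic_gap(n, lam):
    """Prediction (1 - lam^2) lam^N of (10.164)."""
    return (1.0 - lam**2) * lam**n

lam = 0.5
relative_errors = {
    n: abs(smallest_energy(n, lam) / asymptotic_gap(n, lam) - 1.0)
    for n in (8, 12, 16)
}
```

For $\lambda = 0.5$ the relative deviation is already small at N = 8 and shrinks further as N grows, as expected from an asymptotic statement.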
10.7 Exact solution of the quantum Ising chain: N = ∞


The (two-sided) infinite quantum Ising chain is described by the C*-algebra
$F = \mathrm{CAR}(\ell^2(\mathbb{Z}))$; (10.165)

one may also consider a one-sided chain, but it lacks translation symmetry. By the
construction at the beginning of the previous section, F is isomorphic to the infinite
tensor product $A = \bigotimes_{x \in \mathbb{Z}} M_2(\mathbb{C})$. We consider F to be generated by the operators $c_x^{\pm}$
$(x \in \mathbb{Z})$, where $c_x^- = c_x$ and $c_x^+ = c_x^*$. In this notation, the CAR (10.95) - (10.96) read

$[c_x^{\pm}, c_y^{\mp}]_+ = \delta_{xy}$; (10.166)
$[c_x^{\pm}, c_y^{\pm}]_+ = 0$. (10.167)


Although the local Hamiltonians (10.111) do not have a limit as $N \to \infty$, as
explained in §10.5 they do generate a time-evolution on F in the sense of a continuous
homomorphism $\alpha : \mathbb{R} \to \mathrm{Aut}(F)$ via (10.65) and (10.67); see also Theorem 9.15.
Let us first extend the approach in the previous section to $N = \infty$, in which case
$\mathbb{C}^N$ is replaced by $H = \ell^2(\mathbb{Z})$, assuming the theory has already been brought into
fermionic form with local Hamiltonians (10.111) (as we will see, it is this step,
i.e., the Jordan–Wigner transformation, that marks the difference between $N < \infty$
and $N = \infty$). Thus we define operators $A : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ and $B : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$
as the obvious extensions of the N × N matrices A and B to operators on $\ell^2(\mathbb{Z})$,
and similarly $S : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ is the “full” shift operator, defined by $(Sf)(x) = f(x+1)$.
Instead of the somewhat clumsy explicit solution procedure sketched in the
previous section for $N < \infty$, we may now simply rely on the Fourier transformation

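$\mathcal{F} : \ell^2(\mathbb{Z}) \to L^2([-\pi, \pi])$; (10.168)

$(\mathcal{F}f)(k) \equiv \hat{f}(k) = \sum_{x \in \mathbb{Z}} e^{-ikx} f(x)$; (10.169)

$(\mathcal{F}^{-1}\hat{f})(x) = f(x) = \int_{-\pi}^{\pi} \frac{dk}{2\pi}\, e^{ikx} \hat{f}(k)$, (10.170)

which diagonalizes A and B to operators $\hat{A}, \hat{B} : L^2([-\pi, \pi]) \to L^2([-\pi, \pi])$. For the
quantum Ising Hamiltonian (10.110) these are given by the multiplication operators

$\hat{A}\hat{\psi}(k) = -(\cos k + \lambda)\hat{\psi}(k)$; (10.171)
$\hat{B}\hat{\psi}(k) = -i \sin k\, \hat{\psi}(k)$. (10.172)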

For fixed k, the eigenvalues and eigenvectors of the 2 × 2 matrix

$M_k = \begin{pmatrix} -(\cos k + \lambda) & -i \sin k \\ i \sin k & \cos k + \lambda \end{pmatrix}$ (10.173)
are $\pm\epsilon_k$, given by (10.161) with $q_k \rightsquigarrow k$. It is then routine to find a unitary 2 × 2
matrix $U_k = \begin{pmatrix} u_k & v_k \\ v_k & u_k \end{pmatrix}$ that diagonalizes $M_k$ in the sense that
$U_k^{-1} M_k U_k = \begin{pmatrix} -\epsilon_k & 0 \\ 0 & \epsilon_k \end{pmatrix}$.
Fourier transforming these multiplication operators back to $\ell^2(\mathbb{Z})$ then yields an
operator U on $\ell^2(\mathbb{Z}) \oplus \ell^2(\mathbb{Z})$ that satisfies (10.129). This yields a unique ground state
$\omega_0$ characterized by a property like (10.150) or (10.153), where


: ™ dk » : of 
n(f) = / BF Fue ie: (10.174) 
_q 20 
Re ye ey (10.175) 
jeZ 
Go ers (10.176) 
jeZ 


In summary, one-dimensional fermionic models with quadratic Hamiltonians like
(10.111) have a unique ground state even at $N = \infty$. Thus one wonders where SSB in
the quantum Ising chain could possibly come from. We will answer this question.
Almost every argument to follow relies on $\mathbb{Z}_2$-symmetry. In general, a $\mathbb{Z}_2$-action
on a C*-algebra A corresponds to an automorphism $\theta : A \to A$ such that $\theta^2 = \mathrm{id}_A$,
i.e. $\theta$ represents the nontrivial element of $\mathbb{Z}_2$. For example, define $\theta : F \to F$ by

$\theta(c_j^{\pm}) = -c_j^{\pm}$ $(j \in \mathbb{Z})$, (10.177)

which is an example of a Bogoliubov transformation (cf. Proposition 10.12) and
hence extends to an automorphism of F (which implies that $\theta(1_F) = 1_F$). Clearly,
$\theta^2 = \mathrm{id}_F$, and in addition each local Hamiltonian (10.111) is invariant under $\theta$; by
implication, so is the dynamics $\alpha$, i.e., $\alpha_t \circ \theta = \theta \circ \alpha_t$ for all $t \in \mathbb{R}$.

A C*-algebra A carrying a $\mathbb{Z}_2$-action decomposes as

$A = A_+ \oplus A_-$; (10.178)
$A_\pm = \{a \in A \mid \theta(a) = \pm a\}$, (10.179)

where the even part $A_+$ is a subalgebra of A, whereas the odd part $A_-$ is not: one
has $ab \in A_+$ for a, b both in either $A_+$ or $A_-$, and $ab \in A_-$ if one is in $A_+$ and the
other in $A_-$. For example, if $A = B(H)$ for some Hilbert space H and $w : H \to H$ is
a unitary operator satisfying $w^2 = 1$ (and hence $w^* = w$), then

$\theta(a) = waw^*$ $(= waw)$ (10.180)

defines a $\mathbb{Z}_2$-action on A. In that case, $A_+$ and $A_-$ consist of all $a \in A$ that commute
and anticommute with w, respectively, that is,

$A_\pm = \{a \in A \mid aw \mp wa = 0\}$. (10.181)

In case of (10.165) with (10.177), the subspace $F_+$ ($F_-$) is just the linear span of all
products of an even (odd) number of $c_x^{\pm}$’s.
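For $A = B(H)$ with H finite-dimensional, the even/odd decomposition can be made concrete: the parts of a matrix a are $\tfrac{1}{2}(a \pm waw)$. The following sketch uses an arbitrarily chosen diagonal w with $w^2 = 1$ and random real matrices; the names are illustrative.

```python
import numpy as np

def parity_parts(a, w):
    """Split a into even and odd parts under theta(a) = w a w, assuming w* = w, w^2 = 1."""
    theta_a = w @ a @ w
    return 0.5 * (a + theta_a), 0.5 * (a - theta_a)

rng = np.random.default_rng(0)
w = np.diag([1.0, 1.0, -1.0])      # a unitary with w^2 = 1 (arbitrary example)
a = rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3))
a_even, a_odd = parity_parts(a, w)
b_even, b_odd = parity_parts(b, w)
```

One can then verify that `a_even` commutes and `a_odd` anticommutes with w, that the product of two odd elements is even, and that an even element times an odd one is odd, in agreement with the multiplication rules stated for $A_\pm$.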
Let us move to Theorem C.90 and reconsider the proof of the claim that if
$\pi_\omega(A)' \neq \mathbb{C} \cdot 1$, then $\omega$ is mixed. If the commutant $\pi_\omega(A)'$ is nontrivial, then it
contains a nontrivial projection $e_+ \in \pi_\omega(A)'$. It then follows that $e_+\Omega_\omega \neq 0$: for
if $e_+\Omega_\omega = 0$, then $a e_+ \Omega_\omega = e_+ a \Omega_\omega = 0$ for all $a \in A$, so that $e_+ = 0$, since $\Omega_\omega$ is
cyclic. Similarly, $e_-\Omega_\omega \neq 0$ with $e_- = 1_H - e_+$, so we may define the unit vectors

$\Omega_\pm = e_\pm \Omega_\omega / \|e_\pm \Omega_\omega\|$ (10.182)

and the associated states $\omega_\pm(a) = \langle \Omega_\pm, \pi_\omega(a)\Omega_\pm \rangle$ on A. This yields a convex
decomposition $\omega = \lambda\omega_+ + (1 - \lambda)\omega_-$, with $\lambda = \|e_+\Omega_\omega\|^2$. Since $\lambda \neq 0, 1$ and $\omega_+ \neq \omega_-$,
it follows that $\omega$ is mixed. The associated reduction is effected by writing

$H = H_+ \oplus H_-$; (10.183)
$H_\pm = e_\pm H$, (10.184)

in that A (more precisely, $\pi_\omega(A)$) maps each subspace $H_\pm$ into itself. Now pass from
the projections $e_\pm$ to the operator $w = e_+ - e_-$, which by construction satisfies

$w^* = w^{-1} = w$. (10.185)

In particular, w is unitary. Conversely, if some unitary w satisfies $w^2 = 1$, then

$e_\pm = \tfrac{1}{2}(1_H \pm w)$ (10.186)

are projections satisfying $e_+ + e_- = 1_H$, giving rise to the decomposition (10.184).
Group-theoretically, this means that one has a unitary $\mathbb{Z}_2$-action on $H = H_\omega$, in
which the nontrivial element of $\mathbb{Z}_2 = \{-1, 1\}$ is represented by w. The
decomposition (10.184) then simply means that $\mathbb{Z}_2$ acts trivially on $H_+$ (in that both group
elements are represented by the unit operator) and acts nontrivially on $H_-$ (in that
the nontrivial element is represented by minus the unit operator). In conclusion, one
has a $\mathbb{Z}_2$ perspective on the reduction of $H_\omega$, and instead of a projection $e \in \pi_\omega(A)'$
one may equivalently look for an operator $w \in \pi_\omega(A)'$ that satisfies (10.185).


Proposition 10.14. Suppose A carries a $\mathbb{Z}_2$-action $\theta$ and consider a state $\omega : A \to \mathbb{C}$
that is $\mathbb{Z}_2$-invariant in the sense that $\omega(\theta(a)) = \omega(a)$ for all $a \in A$. We write this
as $\theta^*\omega = \omega$, with $\theta^*\omega = \omega \circ \theta$. Then there is a unitary operator $w : H_\omega \to H_\omega$
satisfying $w^2 = 1_H$, $w\Omega_\omega = \Omega_\omega$, and $w\pi_\omega(a)w^* = \pi_\omega(\theta(a))$ for each $a \in A$.

Cf. Corollary 9.12. In this situation, we obtain a decomposition of $H = H_\omega$
according to (10.183), where the projections $e_\pm$ are given by (10.186), so that, equivalently,

$H_\pm = \{\psi \in H \mid w\psi = \pm\psi\} = \overline{A_\pm \Omega_\omega}$. (10.187)

In terms of the decomposition (10.178), it is easily seen that each subspace $H_\pm$
is stable under $A_+$, whereas $A_-$ maps $H_\pm$ into $H_\mp$. We denote the restriction of
$\pi_\omega(A_+)$ to $H_\pm$ by $\pi_\pm$, so that a $\mathbb{Z}_2$-invariant state $\omega$ on A not just gives rise to the
GNS-representation $\pi_\omega$ of A on $H_\omega$, but also induces two representations $\pi_\pm$ of the
even part $A_+$ on $H_\pm$. This leads to a refinement of Theorem C.90:
Theorem 10.15. Suppose A carries a $\mathbb{Z}_2$-action $\theta$, and let $\omega : A \to \mathbb{C}$ be a
$\mathbb{Z}_2$-invariant state. With the above notation, suppose the representation $\pi_+(A_+)$ on $H_+$
is irreducible. Then also the representation $\pi_-(A_+)$ on $H_-$ is irreducible, and there
are the following two possibilities for the representation $\pi_\omega(A)$ on $H = H_+ \oplus H_-$:

1. $\pi_\omega(A)$ is irreducible (and $\omega$ is pure) iff $\pi_+(A_+)$ and $\pi_-(A_+)$ are inequivalent.

2. $\pi_\omega(A)$ is reducible (and $\omega$ is mixed) iff $\pi_+(A_+)$ and $\pi_-(A_+)$ are equivalent.


Proof. The proof of this theorem is much more difficult than one would expect
(given its simple statement), so we restrict ourselves to the easy steps, as well as to
two examples illustrating each of the two possibilities. To start with the latter:

1. $A = M_2(\mathbb{C})$, with $\theta(a) = \sigma_3 a \sigma_3$; note that $\sigma_3^2 = 1_2$ and $\sigma_3^* = \sigma_3$. Then

$A_+ = \Big\{ \begin{pmatrix} z_+ & 0 \\ 0 & z_- \end{pmatrix},\ z_\pm \in \mathbb{C} \Big\} = D_2(\mathbb{C})$; (10.188)

$A_- = \Big\{ \begin{pmatrix} 0 & z_1 \\ z_2 & 0 \end{pmatrix},\ z_1, z_2 \in \mathbb{C} \Big\}$, (10.189)

where $D_n(\mathbb{C})$ denotes the C*-algebra of diagonal n × n matrices. Take $\Omega = (1, 0)$,
with associated state

$\omega(a) = \langle \Omega, a\Omega \rangle$, (10.190)
where $a \in M_2(\mathbb{C})$. It follows from §2.4 that the associated GNS-representation
$\pi_\omega(A)$ is just (equivalent to) the defining representation of $M_2(\mathbb{C})$ on $H_\omega = \mathbb{C}^2$,
in which the cyclic vector $\Omega_\omega$ of the GNS-construction is $\Omega$ itself. Since $\sigma_3\Omega = \Omega$,
the state defined by (10.190) is $\mathbb{Z}_2$-invariant, and the unitary operator w in
Proposition 10.14 is simply $w = \sigma_3$. Hence the decomposition (10.183) of $H = \mathbb{C}^2$
is simply $\mathbb{C}^2 = \mathbb{C} \oplus \mathbb{C}$, i.e.,

$H_+ = \{(z, 0),\ z \in \mathbb{C}\}$; (10.191)
$H_- = \{(0, z),\ z \in \mathbb{C}\}$. (10.192)

Of course, we then have $H_\pm = A_\pm\Omega$. Identifying $H_\pm \cong \mathbb{C}$, this gives the
one-dimensional representations $\pi_\pm(D_2(\mathbb{C}))$ as

$\pi_\pm \begin{pmatrix} z_+ & 0 \\ 0 & z_- \end{pmatrix} = z_\pm$, (10.193)

which are trivially inequivalent. Hence by Theorem 10.15 the defining
representation of $M_2(\mathbb{C})$ on $\mathbb{C}^2$ is irreducible, as it should be.


2. $A = D_2(\mathbb{C})$, with

$\theta(\mathrm{diag}(z_+, z_-)) = \mathrm{diag}(z_-, z_+)$, (10.194)

where we have denoted the matrix in (10.188) by $\mathrm{diag}(z_+, z_-)$. This time,

$A_\pm = \{\mathrm{diag}(z, \pm z),\ z \in \mathbb{C}\}$. (10.195)

We once again define a $\mathbb{Z}_2$-invariant state $\omega$ by (10.190), but this time we take
$\Omega = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. (10.196)

Hence

$H_\pm = \{(z, \pm z),\ z \in \mathbb{C}\}$. (10.197)

We may now identify each $A_\pm$ with $\mathbb{C}$ under the map $\mathrm{diag}(z, \pm z) \mapsto z$ from
$A_\pm$ to $\mathbb{C}$. Similarly, we identify each subspace $H_\pm$ with $\mathbb{C}$ under the map
$H_\pm \to \mathbb{C}$ defined by $(z, \pm z) \mapsto z$. Under these identifications, we have two
one-dimensional representations $\pi_\pm$ of the C*-algebra $\mathbb{C}$ on the Hilbert space $\mathbb{C}$, given
by $\pi_\pm(z) = z$. Clearly, these are equivalent: they are even identical. Hence by
Theorem 10.15 the defining representation of $D_2(\mathbb{C})$ on $\mathbb{C}^2$ is reducible, as it should
be: the explicit decomposition of $\mathbb{C}^2$ in $D_2(\mathbb{C})$-invariant subspaces is just the one
(10.191) - (10.192) of the previous example.
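The (ir)reducibility in these two examples can also be read off from the dimension of the commutant, which is one for an irreducible representation and larger otherwise. The sketch below computes this dimension for a finite set of generating matrices by solving the linear equations $[b, a] = 0$; the helper names are illustrative and the approach is a generic linear-algebra check rather than part of the proof.

```python
import numpy as np

def commutant_dimension(generators):
    """Dimension of {b in M_n(C) : [b, a] = 0 for every generator a}."""
    n = generators[0].shape[0]
    eye = np.eye(n)
    # Row-major vectorisation: vec(b a) = (1 kron a^T) vec(b), vec(a b) = (a kron 1) vec(b).
    rows = [np.kron(eye, a.T) - np.kron(a, eye) for a in generators]
    rank = np.linalg.matrix_rank(np.vstack(rows))
    return n * n - rank

def matrix_unit(i, j, n=2):
    """Matrix with a single entry 1 at position (i, j)."""
    e = np.zeros((n, n))
    e[i, j] = 1.0
    return e

full_algebra = [matrix_unit(i, j) for i in range(2) for j in range(2)]   # spans M_2(C)
diagonal_algebra = [matrix_unit(0, 0), matrix_unit(1, 1)]                # spans D_2(C)
dims = (commutant_dimension(full_algebra), commutant_dimension(diagonal_algebra))
```

The first entry of `dims` equals 1 (only multiples of the unit matrix commute with $M_2(\mathbb{C})$), the second equals 2, reflecting the two invariant subspaces in the second example.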


The first-numbered claim of Theorem 10.15 is relatively easy to prove from
Theorem C.90. Suppose $\pi_\pm(A_+)$ are inequivalent and take $b \in \pi_\omega(A)'$: we want to show
that $b = \lambda \cdot 1$ for some $\lambda \in \mathbb{C}$. Relative to $H = H_+ \oplus H_-$, we write

$b = \begin{pmatrix} b_{++} & b_{+-} \\ b_{-+} & b_{--} \end{pmatrix}$, (10.198)

where the four operators in this matrix act as follows:

$b_{++} : H_+ \to H_+$, $b_{+-} : H_- \to H_+$, $b_{-+} : H_+ \to H_-$, $b_{--} : H_- \to H_-$. (10.199)

Since $A_+ \subset A$, we also have $b \in \pi_\omega(A_+)'$. The condition $[b, a] = 0$ for each $a \in A_+$
is equivalent to the four conditions

$[b_{++}, \pi_+(a)] = 0$; (10.200)
$[b_{--}, \pi_-(a)] = 0$; (10.201)
$\pi_+(a) b_{+-} = b_{+-} \pi_-(a)$; (10.202)
$\pi_-(a) b_{-+} = b_{-+} \pi_+(a)$. (10.203)

We now use the fact (which we state without proof) that, as in group theory, the
irreducibility and inequivalence of $\pi_\pm(A_+)$ implies that there can be no nonzero
operator $c : H_+ \to H_-$ such that $c\pi_+(a) = \pi_-(a)c$ for all $a \in A_+$, and vice versa.
Hence $b_{+-} = 0$ as well as $b_{-+} = 0$. In addition, the irreducibility of $\pi_\pm(A_+)$ implies
that $b_{++} = \lambda_+ \cdot 1_{H_+}$ and $b_{--} = \lambda_- \cdot 1_{H_-}$. Finally, the property $[b, a] = 0$ for each
$a \in A_-$ implies $\lambda_+ = \lambda_-$. Hence $b = \lambda \cdot 1_H$, and $\pi_\omega(A)$ is irreducible.

To prove the second-numbered claim of Theorem 10.15, let $\pi_+(A_+) \cong \pi_-(A_+)$,
so by definition (of equivalence) there is a unitary operator $v : H_- \to H_+$ such that

$v\pi_-(a) = \pi_+(a)v$, $\forall a \in A_+$. (10.204)

Extend v to an operator $w : H \to H$ by
$w = \begin{pmatrix} 0 & v \\ v^* & 0 \end{pmatrix}$. (10.205)

It is easy to verify from (10.204) that $[w, \pi(a)] = 0$ for each $a \in A_+$. To check
that the same is true for each $a \in A_-$, one needs the difficult analytical fact that
w is a (weak) limit of operators of the kind $\pi(a_n)$, where $a_n \in A_-$, which also
implies that $w^*\pi(a) \in \pi(A_+)''$. Since $\pi(A_+)''' = \pi(A_+)'$ and $w \in \pi(A_+)'$, we obtain
$[w^*\pi(a), w] = 0$ for each $a \in A_-$. But for unitary operators w this is the same as
$[w, \pi(a)] = 0$. So $w \in \pi(A)'$, and hence $\pi(A)$ is reducible by Theorem C.90.


In determining the ground state(s) of the quantum Ising chain, we will apply
Theorem 10.15 to the C*-algebra (10.165). This application relies on the representation
theory of F. For the moment we leave the Hilbert space H general, equipped though
with a conjugation $J : H \to H$. It turns out to be convenient to use the self-dual
formulation of the CAR, which treats c and c* on an equal footing. Define


K=H®OH, (10.206) 
whose elements are written as h = (f,g) or h = f+g, with inner product 
(hi ho) k = (fis fo) + (81,82)H- (10.207) 
We then introduce a new operator in CAR(H), namely the field 
P(h) =c"*(f)+c(VJg), (10.208) 


which is linear in h = f+, because the antilinearity of c(f) in f is canceled by the 
antilinearity of J. This yields the anti-commutation relations 


[B* (hy), B(h2)|4 = (a1, h2)K, (10.209) 


but be aware that generally [6*(h,), &*(hz)]4 and [®(h,), B(hz)]4 do not vanish. 
Indeed, in terms of the antilinear operator I’: K — K, defined by 


OJ 
r= On (10.210) 


we have the following expression for the adjoint ®(h)* = &*(h): 
®*(h) = B(Lh). (10.211) 
If we identify f € H with f+0 € K, we may reconstruct c and c* from ® through 


c*(f) = &(f); (10.212) 
c(f) = ®(f). (10.213) 
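
The self-dual relations (10.208) - (10.211) can be checked numerically in a small example. The following is only a sketch under stated assumptions: $H = \mathbb{C}^n$ with $J$ taken to be complex conjugation, and the CAR realized by a finite Jordan-Wigner construction; the function names are illustrative and not part of the text.

```python
import numpy as np

def jordan_wigner_annihilators(n):
    """Annihilation operators c_1..c_n on (C^2)^{(x) n}, built from a finite Jordan-Wigner string."""
    sz = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # single-site annihilator
    ops = []
    for j in range(n):
        op = np.array([[1.0]])
        for m in [sz] * j + [lower] + [np.eye(2)] * (n - j - 1):
            op = np.kron(op, m)
        ops.append(op.astype(complex))
    return ops

def field(h, cs):
    """Phi(h) = c*(f) + c(Jg) for h = (f, g), assuming J = complex conjugation."""
    n = len(cs)
    f, g = h[:n], h[n:]
    # c*(f) = sum_j f_j c_j^* is linear in f; c is antilinear, so c(Jg) = sum_j g_j c_j.
    return sum(f[j] * cs[j].conj().T + g[j] * cs[j] for j in range(n))

def anticommutator(a, b):
    return a @ b + b @ a
```

With these definitions, `anticommutator(field(h1, cs).conj().T, field(h2, cs))` should equal `np.vdot(h1, h2)` times the identity, as in (10.209), and the adjoint of `field(h, cs)` should equal the field evaluated at $\Gamma h = (\bar g, \bar f)$, as in (10.211).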


Bogoliubov transformations now take an extremely elegant form. For any unitary
operator $S$ on $K$ that satisfies $[S, \Gamma] = 0$, we define the transform $\Phi_S$ of $\Phi$ by

\[ \Phi_S(h) = \Phi(Sh), \tag{10.214} \]

with associated creation and annihilation operators (where $H \ni f \equiv f \dot{+} 0$, as above)

\[ c_S^*(f) = \Phi_S(f); \tag{10.215} \]
\[ c_S(f) = \Phi_S^*(f). \tag{10.216} \]


To see the equivalence with the original formulation of the Bogoliubov transformation,
note that for unitary $S$, the condition $[S, \Gamma] = 0$ is equivalent to the structure

\[ S = \begin{pmatrix} u & vJ \\ Jv & JuJ \end{pmatrix}, \tag{10.217} \]

where $u : H \to H$ is linear, $v : H \to H$ is antilinear, and $u$ and $v$ satisfy (10.133) -
(10.134). Moreover, from (10.137) - (10.138) we obtain


(f); (10.218) 
wer (10.219) 


An interesting class of pure states on CAR($H$) arises as follows.

Theorem 10.16. There is a bijective correspondence between:

• Projections $e : K \to K$ that (apart from the properties $e^2 = e^* = e$) satisfy
\[ \Gamma e \Gamma = 1_K - e; \tag{10.220} \]
• States $\omega_e$ on $F$ that satisfy
\[ \omega_e(\Phi(h)^* \Phi(h)) = \langle h, eh\rangle_K \quad \forall h \in K. \tag{10.221} \]

Such a state $\omega_e$ is automatically pure (so that the corresponding GNS-representation
$\pi_e$ is irreducible), and is explicitly given by

\[ \omega_e(\Phi(h_1) \cdots \Phi(h_{2n+1})) = 0; \tag{10.222} \]
/ n 
We(P(h1)---P(hon)) = YY sgn(p) [ ] (eAsgn(2j) 1 Psen(2j—1)) (10.223) 
PES2n j=l 
where the sum $\sum'$ is over all permutations $p$ of $1, \ldots, 2n$ such that
\[ p(2j-1) < p(2j); \tag{10.224} \]
\[ p(1) < p(3) < \cdots < p(2n-1). \tag{10.225} \]


We omit the proof. Note that (10.221) is a special case of (10.223), because of
(10.211). States like $\omega_e$, which are determined by their two-point functions, are
called quasi-free; the ground state $\omega_0$ on CAR($\mathbb{C}^N$) constructed in the previous
section is an example (one also has mixed quasi-free states, e.g. certain KMS states).

As a warm-up, we reconstruct the ground state of the free fermionic Hamiltonian
on $F$ using the above formalism. That is, we assume that $h_N$ in (10.111) reads


\[ h_N = \sum_{x=-N/2}^{N/2-1} c_x^* c_x, \tag{10.226} \]

initially defining dynamics on $F_N = \mathrm{CAR}(\mathbb{C}^N)$. In that case, the projection $e_0$ onto
the second copy of $H = \mathbb{C}^N$ in $K$, i.e.

\[ e_0 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{10.227} \]

reproduces the ground state $\omega_0(a) = \langle 0|a|0\rangle$, where $|0\rangle$ is the vector $1 \in \mathbb{C}$ in $F_-(H)$,
such that $c(f)|0\rangle = 0$ for all $f \in H$. This also works for $N = \infty$, i.e., we construct
dynamics on CAR($\ell^2(\mathbb{Z})$) from the local Hamiltonians (10.226) as indicated at the
beginning of this section, and use the same formula for $e_0$, this time with $H = \ell^2(\mathbb{Z})$.
In the more general case (10.111), we replace $e_0$ in (10.227) by

\[ e^{(S)} = S e_0 S^{-1}, \tag{10.228} \]

where $S$ is given by (10.217), in which for $N < \infty$ the operators $u$ and $v$ were constructed
in (10.131) - (10.132). This time, the associated state $\omega_{e^{(S)}} \equiv \omega_S$ is the state
called $\omega_0$ in Theorem 10.13. As explained at the beginning of this section, this procedure
even works for $N = \infty$ and hence $H = \ell^2(\mathbb{Z})$.

Having understood fermionic models with quadratic Hamiltonians, what remains
to be done now is to reformulate the original quantum Ising chain, defined in terms
of the local spin matrices $\sigma_i(x)$, in terms of the fermionic variables $c_x$ and $c_x^*$. For finite
$N$ this was done through the Jordan-Wigner transformation (10.102) - (10.103).
This time we need a similar isomorphism between $A$ and $F$, where

\[ A = \otimes_{j \in \mathbb{Z}} M_2(\mathbb{C}); \tag{10.229} \]
\[ F = \mathrm{CAR}(\ell^2(\mathbb{Z})), \tag{10.230} \]

and hence we would need to start the sums in the right-hand side of (10.102) -
(10.103) at $j = -\infty$. At first sight this appears to be impossible, though, because
operators like $\exp(\pi i \sum_{y=-\infty}^{x-1} \sigma_+(y)\sigma_-(y))$ do not lie in $A$ (whose elements have infinite
tails of $2 \times 2$ unit matrices). Fortunately, this problem can be solved by adding
a formal operator $T$ to $A$, which plays the role of the "tail"

\[ T = \exp\Big(\pi i \sum_{y=-\infty}^{0} \sigma_+(y)\sigma_-(y)\Big). \tag{10.231} \]


This formal expression (to be used only heuristically) suggests the relations:

\[ T^2 = 1; \tag{10.232} \]
\[ T^* = T; \tag{10.233} \]
\[ TaT = \theta_-(a), \tag{10.234} \]

where $\theta_- : A \to A$ is a $\mathbb{Z}_2$-action defined by (algebraic) extension of

\[ \theta_-(\sigma_\pm(y)) = -\sigma_\pm(y) \quad (y \leq 0); \tag{10.235} \]
\[ \theta_-(\sigma_\pm(y)) = \sigma_\pm(y) \quad (y > 0); \tag{10.236} \]
\[ \theta_-(\sigma_3(y)) = \sigma_3(y) \quad (y \in \mathbb{Z}); \tag{10.237} \]
\[ \theta_-(\sigma_0(y)) = \sigma_0(y) \quad (y \in \mathbb{Z}), \tag{10.238} \]

where $\sigma_0 = 1_2$. Formally, define an algebra extension

\[ \hat{A} = A \oplus A \cdot T, \tag{10.239} \]

with elements of the type $a + bT$, $a, b \in A$, and algebraic relations given by (10.232)
- (10.233). That is, we have

\[ (a + bT)^* = a^* + \theta_-(b^*)T; \tag{10.240} \]
\[ (a + bT) \cdot (a' + b'T) = aa' + b\theta_-(b') + (ab' + b\theta_-(a'))T. \tag{10.241} \]
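
The rules (10.240) - (10.241) can be tried out in a toy model. This is a sketch only, under the assumption that the infinite chain algebra $A$ is replaced by $M_2(\mathbb{C})$ and $\theta_-$ by conjugation with $\sigma_3$ (any involutive $*$-automorphism would do); an element $a + bT$ is stored as the pair `(a, b)`, and all names are illustrative.

```python
import numpy as np

SIGMA3 = np.diag([1.0, -1.0]).astype(complex)

def theta(a):
    """Toy Z_2-action on M_2(C): conjugation by sigma_3 (a stand-in for theta_-)."""
    return SIGMA3 @ a @ SIGMA3

def mul(x, y):
    """(a + bT)(a' + b'T) = aa' + b theta(b') + (ab' + b theta(a'))T, cf. (10.241)."""
    a, b = x
    a2, b2 = y
    return (a @ a2 + b @ theta(b2), a @ b2 + b @ theta(a2))

def star(x):
    """(a + bT)* = a* + theta(b*)T, cf. (10.240)."""
    a, b = x
    return (a.conj().T, theta(b.conj().T))
```

In this representation $T$ is the pair `(0, 1)`; one can then verify that `mul(T, T)` is the unit, that `T a T` reproduces `theta(a)`, that the product is associative, and that `star` reverses products.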


Within A, the correct version of (10.102) - (10.103) may now be written down as 


cf = TeF Lyx 0+ 0)0-0) G+ (x < 1); (10.242) 
ct =To?; (10.243) 
ct = Te MEA g* (> 1), (10.244) 


with formal inverse transformation given by 


Gx. (x) = Te* MES cE (x < 1); (10.245) 
O+(x) = Tey; (10.246) 
ox (x) = Te MEH1+0)°-O)g, (x) (x > 1), (10.247) 


where this time we regard $T$ as an element of the extended fermionic algebra

\[ \hat{F} = F \oplus F \cdot T, \tag{10.248} \]

satisfying the same rules (10.232) - (10.234), but now in terms of a "fermionic" $\mathbb{Z}_2$-action
$\theta_- : F \to F$ given by extending the following action on elementary operators:

\[ \theta_-(c_y^\pm) = -c_y^\pm \quad (y \leq 0); \tag{10.249} \]
\[ \theta_-(c_y^\pm) = c_y^\pm \quad (y > 0). \tag{10.250} \]

Because of $T$, the Jordan-Wigner transformation does not give an isomorphism
$A \cong F$, but it does give an isomorphism $\hat{A} \cong \hat{F}$. More importantly, if, having already
defined the $\mathbb{Z}_2$-action $\theta$ on $F$ by (10.177), we define a similar $\mathbb{Z}_2$-action on $A$ by


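\[ \theta(\sigma_\pm(y)) = -\sigma_\pm(y) \quad (y \in \mathbb{Z}); \tag{10.252} \]
\[ \theta(\sigma_3(y)) = \sigma_3(y) \quad (y \in \mathbb{Z}); \tag{10.253} \]
\[ \theta(\sigma_0(y)) = \sigma_0(y) \quad (y \in \mathbb{Z}), \tag{10.254} \]

and decompose $A = A_+ \oplus A_-$ and $F = F_+ \oplus F_-$ according to this action, cf. (10.178),
we have isomorphisms

\[ A_+ \cong F_+; \tag{10.255} \]
\[ A_- \cong F_- T; \tag{10.256} \]
\[ A \cong F_+ \oplus F_- T. \tag{10.257} \]

For given dynamics (10.111), suppose $\omega_0^A$ is a $\mathbb{Z}_2$-invariant ground state on $A$. Then
$\omega_0^A$ also defines a $\mathbb{Z}_2$-invariant ground state $\omega_0^F$ on $F$ by (10.255) and $\omega_0^F(f) = 0$ for
all $f \in F_-$. Conversely, a $\mathbb{Z}_2$-invariant ground state $\omega_0^F$ on $F$ defines a state $\omega_0^A$ on $A$
by (10.255) and $\omega_0^A(a) = 0$ for all $a \in A_-$. But $F$ has a unique ground state, so:

• Either $\omega_0^A$ is pure on $A$, in which case it is the unique ground state on $A$;
• Or $\omega_0^A$ is mixed on $A$, in which case $\omega_0^A = \tfrac{1}{2}(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm$ are pure but
transform under the above $\mathbb{Z}_2$-action $\theta$ as $\omega_0^\pm \circ \theta = \omega_0^\mp$.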


Theorem 10.15 gives a representation-theoretical criterion deciding between these
possibilities, but to apply it we need some information on the restriction of $\mathbb{Z}_2$-invariant
quasi-free pure states on $F$ to its even part $F_+$. The abstract setting involves
a $\mathbb{Z}_2$-action $W$ on $K$ that commutes with $\Gamma$ (so that $W$ is unitary, $W^2 = 1$, and
$[\Gamma, W] = 0$), which induces a $\mathbb{Z}_2$-action $\theta$ on $F$ by linear and algebraic extension
of $\theta(\Phi(h)) = \Phi(Wh)$. A quasi-free state $\omega_e$, defined according to Theorem 10.16
by a projection $e : K \to K$ that satisfies (10.220), is then $\mathbb{Z}_2$-invariant iff $[W, e] = 0$.
In our case, this simplifies to $\theta(\Phi(h)) = -\Phi(h)$, so that $W = -1$, and every
projection commutes with $W$. In any case, with considerable effort one can prove:

Lemma 10.17. Given some $\mathbb{Z}_2$-action $W$ on $K$, as well as a projection $e : K \to K$
satisfying (10.220), such that $[W, \Gamma] = [W, e] = 0$:

1. The quasi-free state $\omega_e$ of Theorem 10.16 is $\mathbb{Z}_2$-invariant (i.e., $\omega_e \circ \theta = \omega_e$);

2. The corresponding GNS-representation space $H_e \equiv H_{\omega_e}$ for $F = F_+ \oplus F_-$ decomposes
as $H_e = H_e^+ \oplus H_e^-$, with $H_e^\pm = F_\pm \Omega_e$. Each subspace $H_e^\pm$ is stable under
$\pi_e(F_+)$, and the restriction $\pi_e^\pm$ of $\pi_e(F_+)$ to $H_e^\pm$ is irreducible.


Theorem 10.15 then leads to a lemma, which also summarizes the discussion so far.

Lemma 10.18. 1. For given $\mathbb{Z}_2$-invariant dynamics, let $\omega_0^F$ be the (unique, $\mathbb{Z}_2$-invariant)
ground state on $F = F_+ \oplus F_-$. Under $F_+ \subset F$ the associated GNS-representation
space $H_0$ decomposes as $H_0 = H_0^+ \oplus H_0^-$, with $H_0^\pm = F_\pm \Omega_0$, and
we denote the restriction of $\pi_0(F_+)$ to $H_0^\pm$ by $\pi_0^\pm$. Then $\pi_0^\pm(F_+)$ are irreducible.

2. Regard $\omega_0^F$ also as a state $\tilde\omega_0^F$ on $F_+ \oplus F_- T$ by putting $\tilde\omega_0^F(a) = 0$ for all $a \in F_- T$,
and similarly as a state $\omega_0^A$ on $A$ by invoking (10.255) and putting $\omega_0^A(a) = 0$ for
all $a \in A_-$. Let $\tilde H_0 = \tilde H_0^+ \oplus \tilde H_0^T$ be the GNS-representation space of $F_+ \oplus F_- T$
defined by $\tilde\omega_0^F$, where $\tilde H_0^+ = F_+ \tilde\Omega_0$ and $\tilde H_0^T = F_- T \tilde\Omega_0$. Here $\tilde H_0^+$ and $\tilde H_0^T$ are stable
under $F_+$; we denote the restriction of $F_+$ to $\tilde H_0^+$ and $\tilde H_0^T$ by $\tilde\pi_0^+$ and $\tilde\pi_0^T$, respectively, so that $\tilde\pi_0^+ \cong \pi_0^+$.

a. Then $\omega_0^A$ is a ground state on $A$. Any $\mathbb{Z}_2$-invariant ground state on $A$ arises in
this way (via $F$), so that there is a unique $\mathbb{Z}_2$-invariant ground state on $A$.

b. The state $\omega_0^A$ is pure on $A$ iff the irreducible representations $\tilde\pi_0^+(F_+)$ (or
$\pi_0^+(F_+)$) and $\tilde\pi_0^T(F_+)$ are inequivalent.


It turns out to be difficult to directly check the (in)equivalence of 2/(F,). For- 


tunately, we can circumvent this problem by passing to yet another (irreducible) 
representation of F.. We first enlarge F to a new algebra 


F=F OFT =F, OF_OF,T OFT, (10.258) 


and extend the state of on F toa state @p on F by putting @o(FT ) = 0, so that @ 
is nonzero only on F; C F. Let fo be the associated GNS-representation of F on the 


Hilbert space Hy = FQ. Under ft (F,) this space decomposes as 


Ao = F, Qo @ F_Qo @ F, TQ @ F_T Qo, (10.259) 


with corresponding restrictions #,.(F,) and #1(F,); more precisely, #4 is the re- 


striction of #(F,) to F,.Qo, whilst #1 is is the restriction of #(F,) to F,TQp. 
Clearly, #(F,.) is the same as 1 (F,.), and #7 (F,.) is just our earlier 27 (F,.), but 
#1 (F,.) is new. To understand the latter, we rewrite (10.259) as 


Ay = Hy OG; (10.260) 
Ho = FQ @ F_Qo & Fy. 20 PB F_Qo; (10.261) 
Ag = F,TQ)F_TQo, (10.262) 


the point being that #(F’) evidently restricts to both Hp and Ag . We know the action 
of #(F') on Ho quite well: it is the representation induced by the ground state @p. As 
to Hg. we define a state oS on F by 


@§ (a) = (#(T)Q0, #(a)#(T) Qo) q, = (Qo, #(O-(a)) 20) 5 (10.263) 


where the second equality follows from (10.234). Comparing Ho and Ao, for all 
b € F (and hence especially for b = 6_(a)) we simply have 

(Qo, #(b) Qo) q, = @o(b) = a (2), (10.264) 
so that an = of 08_=0* of . Decomposing the GNS-representation space Hg» of 
of Tas eof (F) as 6+ of = He: of OH, 


. AT . . . 
et af” it follows that 7 (F;) is the restriction 
+ . A . fs T . 
OF To» wF (F,.) to H 6° af Therefore, the representation #(F) restricted to Hj is the 
GNS-representation 1px of (F), so that in turn #{(F,) is 1. of (F4.), restricted to 


jek Ae ae Hence, further to (10.260) - (10.262), we obtain the decomposition 


The point is that for the quantum Ising chain Hamiltonian (10.110), we have: 

Lemma 10.19. 7. For each A # +1, we have Tgk (F) = Tg eof (F). 

2. If this holds, then the representations 1) (Fy) = TT" (F,) and m1" (F,) are in- 
0 


equivalent iff the representations Tr, (F,.) and Tae of (+) are equivalent. 
‘i * 


3. For each A # +1, the ground state ag is pure on A iff the representations 
Tye (Fy) and Tg eof (F,.) are equivalent. 


The first claim follows from Theorem 10.20 below. The third follows from Lemma 
10.18 and the previous claims. The second claim is proved by repeatedly applying 
Theorem 10.15 to #(F’). Given this lemma, the real issue now lies in comparing Tegk 
and 7px of both as representations of F (as they are defined) and as representations 
of F, C F. This can be settled in great generality by first looking at Theorem 10.16, 
and thence, recalling the positive-energy projection (10.228), realizing that 


Togk = 4 8): (10.266) 
79+ of = Ty Sw (10.267) 


Here $W_- : K \to K$ is the $\mathbb{Z}_2$-action on $K$ defining the $\mathbb{Z}_2$-action $\theta_-$ on $F$ as explained
above Lemma 10.17; specifically, $W_-$ is the direct sum of two copies of
$w_- : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$, defined by $w_-(f_j) = f_j$ ($j > 0$) and $w_-(f_j) = -f_j$ ($j \leq 0$).
Subsequently, without proof we invoke a basic result on the CAR-algebra:

Theorem 10.20. Let $e$ and $e'$ be projections on $K$ that satisfy (10.220). Then:

1. $\pi_e(F) \cong \pi_{e'}(F)$ iff $e - e' \in B_2(K)$;
2. $\pi_e^+(F_+) \cong \pi_{e'}^+(F_+)$ iff $e - e' \in B_2(K)$ and $\dim(eK \cap (1 - e')K)$ is even.

If the first condition is satisfied, the dimension in the second part is finite, so that
one may indeed say it is even or odd. From Lemmas 10.18 and 10.19 and Theorem
10.20, we finally obtain the phase structure of the infinite quantum Ising chain:

Theorem 10.21. The unique $\mathbb{Z}_2$-invariant ground state $\omega_0^A$ of the Hamiltonian (10.110)
is pure (and hence forms the unique ground state) iff both of the following hold:

\[ e^{(S)} - W_- e^{(S)} W_- \in B_2(K); \tag{10.268} \]
\[ \dim\big(e^{(S)}K \cap (1 - W_- e^{(S)} W_-)K\big) \text{ is even}. \tag{10.269} \]

This is true for all $\lambda$ with $|\lambda| > 1$. If $|\lambda| < 1$, then $\omega_0^A = \tfrac{1}{2}(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm$ are
pure and transform under the $\mathbb{Z}_2$-action $\theta$ as $\omega_0^\pm \circ \theta = \omega_0^\mp$.

10.8 Spontaneous symmetry breaking in mean-field theories


We are now going to study SSB in so-called mean-field theories: these are quantum
spin systems with Hamiltonians like the Curie-Weiss model for ferromagnetism:

\[ h_\Lambda^{CW} = -\frac{J}{2|\Lambda|} \sum_{x,y \in \Lambda} \sigma_3(x)\sigma_3(y) - B \sum_{x \in \Lambda} \sigma_1(x), \tag{10.270} \]

where $J > 0$ scales the spin-spin coupling, and $B$ is an external magnetic field.
Similar to the quantum Ising model, (10.270) has a $\mathbb{Z}_2$-symmetry $(\sigma_1, \sigma_2, \sigma_3) \mapsto (\sigma_1, -\sigma_2, -\sigma_3)$,
which at each site $x$ is implemented by $u(x) = \sigma_1(x)$. This model
differs from its short-range counterpart (9.42), i.e., the quantum Ising model, or the
Heisenberg model (9.44), in that every spin now interacts with every other spin. It
falls into the class of homogeneous mean-field theories, which are defined by a
single-site Hilbert space $H_x = H = \mathbb{C}^n$ and local Hamiltonians of the type

\[ h_\Lambda = |\Lambda| \, h\big(T_0^{(\Lambda)}, T_1^{(\Lambda)}, \ldots, T_{n^2-1}^{(\Lambda)}\big). \tag{10.271} \]

Here $T_0 = 1_n$, and the matrices $(T_i)_{i=1}^{n^2-1}$ in $M_n(\mathbb{C})$ form a basis of the real vector
space of traceless self-adjoint $n \times n$ matrices; the latter may be identified with $i$ times
the Lie algebra $su(n)$ of $SU(n)$, so that $(T_0, T_1, \ldots, T_{n^2-1})$ is a basis of $i$ times the
Lie algebra $u(n)$ of the unitary group $U(n)$ on $\mathbb{C}^n$. In those terms, we define

\[ T_i^{(\Lambda)} = \frac{1}{|\Lambda|} \sum_{x \in \Lambda} T_i(x). \tag{10.272} \]

Finally, $h$ is a polynomial (which is sensitive to operator ordering). For example, to
cast (10.270) (with $J = 1$) in the form (10.271), take $n = 2$, $T_i = \tfrac{1}{2}\sigma_i$ ($i = 1,2,3$), and

\[ h^{CW}(T_0, T_1, T_2, T_3) = -2(T_3^2 + BT_1). \tag{10.273} \]


The assumptions of Theorem 9.15 do not hold now, and indeed the local dynamics
(9.40) fails to converge to global dynamics on the quasi-local C*-algebra $A$
defined by (8.130). Fortunately, it does converge to a global dynamics on the C*-algebra
$C(S(B))$, where $B = M_n(\mathbb{C})$ is the single-site algebra. In order to describe
the limiting dynamics of (homogeneous) mean-field models as $\Lambda \nearrow \mathbb{Z}^d$, we equip
the state space $S(B)$ with the Poisson structure (8.52), which we now elucidate.

For unital C*-algebras $B$, we may regard $S(B)$ as a w*-compact subspace of either
the complex vector space $B^*$ or the real vector space $B_{sa}^*$; in the latter case we regard
states as linear maps $\omega : B_{sa} \to \mathbb{R}$ that satisfy $\omega(1_B) = 1$ and $\omega(a^2) \geq 0$ for each
$a \in B_{sa}$. If $B = M_n(\mathbb{C})$, which is all we need, we may furthermore identify $B_{sa}^*$ with
$iu(n)^*$, and since the value of each state $\omega \in S(M_n(\mathbb{C}))$ is fixed on $T_0 = 1_n \in iu(n)$,
it follows that $S(M_n(\mathbb{C}))$ is a compact convex subset of $isu(n)^*$. In that case, the
Poisson bracket (8.52) on $S(M_n(\mathbb{C}))$ is none other than the restriction of (minus) the
canonical Lie-Poisson bracket on $su(n)^* \cong isu(n)^*$ to $S(M_n(\mathbb{C}))$, cf. (3.98) - (3.99).

For example, for $n = 2$ we have $S(M_2(\mathbb{C})) \cong B^3 \subset \mathbb{R}^3$ by Proposition 2.9, i.e.,

\[ \omega_{(x,y,z)}(a) = \mathrm{Tr}(\rho(x,y,z)a) \quad ((x,y,z) \in B^3, \ a \in M_2(\mathbb{C})); \tag{10.274} \]
\[ \rho(x,y,z) = \frac{1}{2} \begin{pmatrix} 1+z & x-iy \\ x+iy & 1-z \end{pmatrix}. \tag{10.275} \]

We also have $su(2)^* \cong \mathbb{R}^3$ upon the choice of the basis $(T_i = \tfrac{1}{2}\sigma_i)$, $i = 1,2,3$, of
$isu(2)$, which means that $\theta_{(x,y,z)} \in isu(2)^*$ maps $(T_1, T_2, T_3)$ to $(x,y,z)$ (where this
time $(x,y,z) \in \mathbb{R}^3$), cf. §5.8. If we now regard the matrices $T_i$ as functions $\tilde T_i$ on $B^3$
by $\tilde T_i(\omega) = \omega(T_i)$, we find that the corresponding functions on $B^3$ are given by

\[ \tilde T_1(x,y,z) = \tfrac{1}{2}x, \quad \tilde T_2(x,y,z) = \tfrac{1}{2}y, \quad \tilde T_3(x,y,z) = \tfrac{1}{2}z. \tag{10.276} \]

The corresponding Poisson brackets (8.52) are $\{\tilde T_1, \tilde T_2\} = -\tilde T_3$ etc., i.e., $\{x, y\} = -2z$
etc.; this is $-2$ times the bracket defined in (3.43) or (3.97) - (3.98). This factor
2 could have been avoided by moving to the three-ball with radius $r = 1/2$ instead
of $r = 1$, whose boundary is the coadjoint orbit $\mathcal{O}_{1/2}$ naturally associated to spin-$\tfrac{1}{2}$.
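
The bracket $\{x, y\} = -2z$ can be checked numerically. The sketch below assumes that, on linear functions $\hat a(\omega) = \omega(a)$, the Poisson bracket (8.52) takes the form $\{\hat a, \hat b\}(\omega) = i\omega([a,b])$, which is how it is used in the proof of Theorem 10.22; the function names are illustrative.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def density_matrix(x, y, z):
    """rho(x, y, z) of (10.275), written as (1 + x s1 + y s2 + z s3) / 2."""
    return 0.5 * (np.eye(2) + x * SIGMA[0] + y * SIGMA[1] + z * SIGMA[2])

def state(point, a):
    """omega_(x,y,z)(a) = Tr(rho(x,y,z) a), cf. (10.274)."""
    return np.trace(density_matrix(*point) @ a)

def bracket_of_linear(a, b, point):
    """Assumed form of (8.52) on linear functions: {a^, b^}(omega) = i omega([a, b])."""
    return (1j * state(point, a @ b - b @ a)).real
```

At a point $(x,y,z)$ of the ball, `bracket_of_linear(SIGMA[0], SIGMA[1], point)` returns $-2z$, and replacing $\sigma_i$ by $T_i = \tfrac{1}{2}\sigma_i$ gives $-\tfrac{1}{2}z = -\tilde T_3$.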


We now return to our continuous bundle of C*-algebras $A^{(c)}$ of Theorem 8.4, of
course in the slightly adapted form appropriate to quantum spin systems, see §8.6. In
particular, we recall that $A_0^{(c)} = C(S(B))$ and $A_{1/N}^{(c)} = B(H_{\Lambda_N})$, cf. (8.157) - (8.158),
and hence we see the limit $N \to \infty$ as a specific way of taking the limit $\Lambda \nearrow \mathbb{Z}^d$
along the hypercubes $\Lambda_N$. Symmetric and quasi-symmetric sequences $(a_{1/N})_{N \in \mathbb{N}}$
are defined as explained after (8.161). The following observation is fundamental.

Theorem 10.22. Let $B = M_n(\mathbb{C})$. If $(a_{1/N})_{N \in \mathbb{N}}$ and $(b_{1/N})_{N \in \mathbb{N}}$ are symmetric sequences
with limits $a_0$ and $b_0$ as defined by (8.46), respectively (so that $(a_{1/N})_{N \in \mathbb{N}}$
and $(b_{1/N})_{N \in \mathbb{N}}$ are continuous sections of the continuous bundle $A^{(c)}$), then the sequence

\[ \big(\{a_0, b_0\}, \ i[a_1, b_1], \ \ldots, \ i|\Lambda_N|[a_{1/N}, b_{1/N}], \ \ldots\big) \tag{10.277} \]

defines a continuous section of $A^{(c)}$. In particular, for each $\omega \in S(B)$ we have

\[ i \lim_{N \to \infty} \omega^N\big(|\Lambda_N|[a_{1/N}, b_{1/N}]\big) = \{a_0, b_0\}(\omega). \tag{10.278} \]


Proof. The proof is a straightforward combinatorial exercise, and we just mention
the simplest case where $d = 1$ and $a_{1/N} = S_{1,N}(a_1)$ and $b_{1/N} = S_{1,N}(b_1)$, where
$a_1 \in B$ and $b_1 \in B$, cf. (8.39). Then $a_0 = \hat a_1$, $b_0 = \hat b_1$, and similarly to (8.45) we find

\[ [S_{1,N}(a_1), S_{1,N}(b_1)] = \frac{1}{N} S_{1,N}([a_1, b_1]). \tag{10.279} \]

Using (8.52), we find that (10.277) is equal to $(i\widehat{[a_1,b_1]}, \ldots, iS_{1,N}([a_1,b_1]), \ldots)$. Since
$\omega^N(S_{1,N}([a_1,b_1])) = \omega([a_1,b_1])$, the left-hand side of (10.278) is therefore equal to
$i\omega([a_1,b_1])$, which by (8.52) equals the right-hand side.
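
The identity (10.279) is easy to confirm on a few sites. The following sketch assumes that $S_{1,N}(a)$ is the site average $\frac{1}{N}\sum_k a(k)$ of a one-site operator, which is what symmetrizing $a \otimes 1 \otimes \cdots \otimes 1$ yields; the helper names are illustrative.

```python
import numpy as np

def site_operator(a, k, n_sites):
    """The one-site operator a acting on site k of n_sites, identity elsewhere."""
    d = a.shape[0]
    op = np.array([[1.0 + 0j]])
    for j in range(n_sites):
        op = np.kron(op, a if j == k else np.eye(d))
    return op

def symmetrized(a, n_sites):
    """S_{1,N}(a), assumed here to be the average of a over all sites."""
    return sum(site_operator(a, k, n_sites) for k in range(n_sites)) / n_sites

def commutator(x, y):
    return x @ y - y @ x
```

For instance, with `a` and `b` two Pauli matrices, `commutator(symmetrized(a, n), symmetrized(b, n))` agrees with `symmetrized(commutator(a, b), n) / n`, so the commutator itself is of order $1/N$ while $N$ times it stays of order one.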
In other words, although the sequence of commutators $[a_{1/N}, b_{1/N}]$ converges to zero
(which is why $A_0^{(c)}$ has to be commutative!), the rescaled commutators $iN[a_{1/N}, b_{1/N}]$
converge to the macroscopic observable $\{a_0, b_0\} \in C(S(B))$. This reconfirms the
analogy between the limit $N \to \infty$ and the limit $\hbar \to 0$ of Chapter 7, see especially
Definitions 7.1 and 8.2. With $B = M_n(\mathbb{C})$, Theorem 10.22 implies the central result
about the macroscopic (and hence classical!) dynamics of mean-field theories:


Corollary 10.23. Let (hiv) yey be a continuous section of A‘) defined by a sym- 


metric sequence, and let (a,/n) yew be an arbitrary continuous section of A) (i.e. 
a quasi-symmetric sequence). Then, writing hy ;y = hay for clarity, the sequence 


(ao(t),e@'ae oe lant ay ye AN" oo ) j (10.280) 


where $a_0(t)$ is the solution of the equations of motion on $S(M_n(\mathbb{C}))$ with classical
Hamiltonian $h_0$ and Poisson bracket (8.52), defines a continuous section of $A^{(c)}$.

In other words, the Heisenberg dynamics on $A_{1/N}^{(c)} = B(H_{\Lambda_N})$ defined by the quantum
Hamiltonians $h_{\Lambda_N}$ converges to the classical dynamics on the Poisson manifold
$S(M_n(\mathbb{C}))$ that is generated by their classical limit, viz. the Hamiltonian $h_0$.
For example, since the operators $T_i^{(\Lambda)}$ form symmetric sequences, so do Hamiltonians
of the type (10.271). The limit $h_0 \in C(S(M_n(\mathbb{C})))$ of the family $(h_\Lambda)$ in
(10.271) is simply obtained by replacing the operators $T_i^{(\Lambda)}$ in the function $h$ by the
functions $\tilde T_i$ on $S(M_n(\mathbb{C}))$. Equivalently, one may replace the $T_i^{(\Lambda)}$ by the canonical
coordinates $(\theta_i)$ of $isu(n)^*$ dual to the basis $(T_1, \ldots, T_{n^2-1})$ of $isu(n)$, i.e.,
$\theta_i(T_j) = \delta_{ij}$, and restrict the ensuing function on $isu(n)^*$ to $S(M_n(\mathbb{C})) \subset isu(n)^*$.
Using (10.276), for the Curie-Weiss model (10.270) with $J = 1$ this gives

\[ h_0^{CW}(x,y,z) = -\tfrac{1}{2}z^2 - Bx. \tag{10.281} \]


The ground states of this Hamiltonian are simply its minima, viz.

\[ x_\pm = (B, 0, \pm\sqrt{1 - B^2}) \quad (0 \leq B < 1); \tag{10.282} \]
\[ x = (1, 0, 0) \quad (B \geq 1), \tag{10.283} \]

all of which lie on the boundary $S^2$ of $B^3$. Note that the points $x_\pm$ coalesce as $B \to 1$,
where they form a saddle point. Modulo our use of radius $r = 1$ instead of $r = 1/2$,
this result coincides with (10.81) for the classical limit of the quantum Ising model.
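
The minima (10.282) - (10.283) can be cross-checked against a brute-force scan of the unit ball. This is a minimal sketch, assuming $J = 1$ and $B \geq 0$ as in the text; the sampling routine and all names are illustrative.

```python
import numpy as np

def h0_cw(x, y, z, b):
    """Classical Curie-Weiss Hamiltonian (10.281) with J = 1; y enters only through the ball constraint."""
    return -0.5 * z**2 - b * x

def analytic_minima(b):
    """Minimizers (10.282)-(10.283) on the closed unit ball, for b >= 0."""
    if b < 1:
        root = np.sqrt(1 - b**2)
        return [(b, 0.0, root), (b, 0.0, -root)]
    return [(1.0, 0.0, 0.0)]

def sample_ball(n_points, seed=0):
    """Random points in the closed unit ball, by rejection sampling from the cube."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-1, 1, size=(4 * n_points, 3))
    pts = pts[np.linalg.norm(pts, axis=1) <= 1]
    return pts[:n_points]
```

Evaluating `h0_cw` on the sampled points never undercuts its value at the analytic minimizers, which lie on the unit sphere; on that sphere with $y = 0$ the function reduces to $-\tfrac{1}{2}(1 - x^2) - Bx$, which is minimal at $x = B$ for $B < 1$ and at $x = 1$ otherwise.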

We now turn to symmetry and its possible breakdown. Suppose there is some
subgroup of $U(n)$, typically the image of a unitary representation $g \mapsto u_g$ of a compact
group $G$ on $\mathbb{C}^n$, under which $h(T_0, T_1, \ldots, T_{n^2-1})$ in (10.271) satisfies

\[ h(T_0, u_g T_1 u_g^*, \ldots, u_g T_{n^2-1} u_g^*) = h(T_0, T_1, \ldots, T_{n^2-1}) \quad (g \in G). \tag{10.284} \]

For example, in the Curie-Weiss model one has $G = \mathbb{Z}_2$, whose nontrivial element
is represented by $\sigma_1$. For (10.271) itself this implies $u_g^{(\Lambda)} h_\Lambda (u_g^{(\Lambda)})^* = h_\Lambda$, cf. (10.69).
Hence also in homogeneous mean-field models we obtain the structure (10.57),
(10.58), and (10.59) familiar from the case of short-range forces. For the limit theory
this implies that the classical Hamiltonian $h_0$ on $S(M_n(\mathbb{C}))$ is invariant under the
coadjoint action of $G \subset U(n)$ on $isu(n)^*$, restricted to $S(M_n(\mathbb{C})) \subset isu(n)^*$: in the
Curie-Weiss model this "classical shadow" of the $\mathbb{Z}_2$ symmetry of the quantum
theory is simply the map $(x,y,z) \mapsto (x,-y,-z)$ on $B^3$.

In the regime $0 < B < 1$, the degenerate ground states of this model break this
symmetry. In contrast, it can be shown from the Perron-Frobenius Theorem (which
applies since both $\sigma_3$ and $\sigma_1$ are real matrices) that for $B > 0$ each quantum-mechanical
Hamiltonian (10.270) has a unique ground state $\psi_N^{(0)}$. Being unique,
this vector must share the invariance of $h_N$ under the permutation group $\mathfrak{S}_N$, so that

\[ \psi_N^{(0)} = \sum_{n_+=0}^{N} c(n_+/N) \, |n_+, n_-\rangle, \tag{10.285} \]

where $|n_+, n_-\rangle$ is the totally symmetrized unit vector in $\otimes^N \mathbb{C}^2$ with $n_+$ spins up
and $n_- = N - n_+$ spins down, and $c : \{0, 1/N, 2/N, \ldots, (N-1)/N, 1\} \to [0,1]$ is
some function such that $\sum_{n_+} c(n_+/N)^2 = 1$ (we may assume $c \geq 0$ by the Perron-Frobenius
Theorem). The asymptotic behaviour of $c$ as $N \to \infty$ has been studied,
and as expected, $c$ converges pointwise to $c(0) = c(1) = \sqrt{1/2}$ and $c(x) = 0$
elsewhere (at $B = 0$ one of course has either $c(0) = 1$ or $c(1) = 1$ for all $N$).
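
For small $N$ the uniqueness and symmetry of the quantum ground state can be observed by exact diagonalization. The sketch below assumes the normalization $h = -\frac{J}{2N}\sum_{x,y}\sigma_3(x)\sigma_3(y) - B\sum_x \sigma_1(x)$ of (10.270) on $N$ sites; the function names are illustrative and the approach is only feasible for a handful of spins.

```python
import numpy as np

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])

def embed(a, k, n):
    """The one-site matrix a acting on site k of n spins, identity elsewhere."""
    op = np.array([[1.0]])
    for j in range(n):
        op = np.kron(op, a if j == k else np.eye(2))
    return op

def curie_weiss(n, b, j=1.0):
    """Curie-Weiss Hamiltonian on n spins, in the normalization assumed for (10.270)."""
    total3 = sum(embed(S3, k, n) for k in range(n))
    total1 = sum(embed(S1, k, n) for k in range(n))
    return -(j / (2 * n)) * total3 @ total3 - b * total1

def ground_state(h):
    """Two lowest eigenvalues and the ground vector of a real symmetric matrix."""
    vals, vecs = np.linalg.eigh(h)
    return vals[0], vals[1], vecs[:, 0]
```

For, say, `curie_weiss(4, 0.5)` one finds a strictly positive gap, a ground vector whose components all have the same sign (as Perron-Frobenius predicts), invariance under the global spin flip $\otimes\sigma_1$, and vanishing total magnetization in the 3-direction.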

Thus we encounter a familiar headache: the "higher-level" theory $C(S(M_n(\mathbb{C})))$
at $N = \infty$ breaks the $\mathbb{Z}_2$ symmetry, whereas the "lower-level" quantum theories
$B(H_{\Lambda_N})$ ($N < \infty$) do not, although the former should be a limiting case of the latter.
Indeed, the situation for the Curie-Weiss model in the regime $0 < B < 1$ is exactly
analogous to the double-well potential as well as to the quantum Ising model in the
same regime: if the two degenerate ground states $x_\pm \in B^3$ of $h_0^{CW}$ are reinterpreted as
Dirac measures $\delta_\pm$ on $B^3$, which in turn are seen as (pure) states $\omega_0^\pm$ on the classical
algebra of observables $C(S(M_2(\mathbb{C})))$, then (10.74) holds, mutatis mutandis.

The resolution of this problem through the restoration of Butterfield's Principle
should also be the same as for the previous two cases: there is a first excited state
$\psi_N^{(1)}$ such that as $N \to \infty$, the energy difference with the ground state approaches
zero and one has approximate symmetry breaking as in (10.75). Alas, for the Curie-Weiss
model so far only numerical evidence is available supporting this scenario.

Equilibrium states of homogeneous mean-field models at any inverse temperature
$0 < \beta < \infty$ exist, despite the fact that in such models time-evolution $\alpha_t$ on the
infinite system $A$ (and hence the KMS condition characterizing equilibrium states)
is ill-defined (unless one passes to certain representations of $A$, which would be
question-begging). Instead, one invokes the quasi-local C*-algebra $A$, cf. (8.130),
and in lieu of KMS states looks for limit points $\hat\omega^\beta \in S(A)$ of the local Gibbs states
$\omega_{\Lambda_N}^\beta$ defined by (9.96) as $N \to \infty$; see (10.44) and surrounding discussion. Proposition
10.8 does not apply now, but Theorem 8.9 does: since each local Hamiltonian
$h_{\Lambda_N}$ is permutation-invariant (because each $T_i^{(\Lambda_N)}$ is), so is each local Gibbs state
$\omega_{\Lambda_N}^\beta$, and accordingly, each w*-limit point of this sequence must share this property.
As in (8.174), from the quantum De Finetti Theorem 8.9 we therefore have:

\[ \hat\omega^\beta = \int_{S(M_n(\mathbb{C}))} d\mu_\beta(\theta) \, (\omega_\theta^\beta)^\infty, \tag{10.286} \]

for some probability measure $\mu_\beta$ on the single-spin state space $S(M_n(\mathbb{C}))$. By Proposition
8.28, this measure may also be regarded as a limit of the local Gibbs states, but
now regarded as a state on the limit algebra $A_0^{(c)} = C(S(M_n(\mathbb{C})))$ rather than as a state
on $A_0^{(q)} = A$. By the same token, each state $\omega_\theta^\beta$ in the decomposition (10.286) is a
pure state on $A_0^{(c)}$ (though seen as a state on $M_n(\mathbb{C})$ it will be mixed!). The states $\omega_\theta^\beta$


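are computed as follows. Given a classical Hamiltonian $h_0$ computed from (10.271)
as explained after Corollary 10.23, for each point $\theta = (\theta_0, \ldots, \theta_{n^2-1}) \in iu(n)^*$ we
define a new self-adjoint operator $h_\theta \in M_n(\mathbb{C})$ by

\[ h_\theta = h_0(\theta) + \sum_{i=1}^{n^2-1} \frac{\partial h_0}{\partial \theta_i}(\theta) \, T_i. \tag{10.287} \]

For example, in the Curie-Weiss model, from (10.273) we have

\[ h_0^{CW}(\theta) = -2(\theta_3^2 + B\theta_1); \tag{10.288} \]
\[ h_\theta^{CW} = h_0^{CW}(\theta) - 2\theta_3\sigma_3 - B\sigma_1. \tag{10.289} \]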


Eq. (10.287) has the following origin. Let $\omega$ be any state on $A$ for which the strong
limit $T_i^{(\infty)}$ of each operator $\pi_\omega(T_i^{(\Lambda_N)})$ on $H_\omega$ exists as $N \to \infty$ (for example, as in the
proof of Theorem 8.16 one may show that this is the case when $\omega$ is a permutation-invariant
state of $A$). It easily follows that $T_i^{(\infty)}$ lies in the algebra at infinity for $\pi_\omega$,
and hence in the center of $\pi_\omega(A)''$, cf. §8.5. If, in addition, $\omega$ is primary, then

\[ T_i^{(\infty)} = \theta_i \cdot 1_{H_\omega}; \tag{10.290} \]
\[ \theta_i = \lim_{N \to \infty} \omega(T_i^{(\Lambda_N)}). \tag{10.291} \]
N- co 


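Under these assumptions, we compute the commutator

\[ [\pi_\omega(h_{\Lambda_N}), \pi_\omega(a)] = \sum_i \frac{\partial h}{\partial \theta_i}\big(\pi_\omega(T_0^{(\Lambda_N)}), \ldots, \pi_\omega(T_{n^2-1}^{(\Lambda_N)})\big) \cdot \sum_{x \in \Lambda_N} [\pi_\omega(T_i(x)), \pi_\omega(a)] + O\Big(\frac{1}{|\Lambda_N|}\Big), \]

where $a \in \cup_\Lambda A_\Lambda$, and $O(1/|\Lambda_N|)$ denotes a finite sum of (multiple) commutators
between some power of $a$ and operators that are (norm-) bounded in $N$. For
example, for the Curie-Weiss model the $O(1/|\Lambda_N|)$ term is a multiple of

\[ \sum_{x \in \Lambda_N} \big[[\pi_\omega(\sigma_3(x)), \pi_\omega(a)], T_3^{(\Lambda_N)}\big]. \tag{10.292} \]

Since $a$ is local, all commutators $\sum_{x \in \Lambda_N} [\pi_\omega(T_i(x)), \pi_\omega(a)]$ are in $\pi_\omega(A)$, so that further
commutators à la (10.292) vanish as $N \to \infty$. Also, in this limit the terms $T_i^{(\Lambda_N)}$
in the argument of $\sum_i \partial h/\partial \theta_i$ assume their c-number values $\theta_i$, so that

\[ \lim_{N \to \infty} [\pi_\omega(h_{\Lambda_N}), \pi_\omega(a)] = [h_\omega, \pi_\omega(a)], \tag{10.293} \]

where formally (i.e. on a suitable domain) we have an $\omega$-dependent Hamiltonian

\[ h_\omega = \sum_{x \in \mathbb{Z}^d} \pi_\omega(h_\theta(x)), \tag{10.294} \]

where the $\theta_i$ depend on $\omega$ via (10.291). Also, for each $a \in A$ one has strong limits

\[ \lim_{N \to \infty} \pi_\omega\big(e^{ith_{\Lambda_N}} a \, e^{-ith_{\Lambda_N}}\big) = e^{ith_\omega} \pi_\omega(a) e^{-ith_\omega}. \tag{10.295} \]

Hence in the limit $N = \infty$ (provided it makes sense, which it does under the stated
assumptions), the original mean-field Hamiltonian (10.271) with its homogeneous
long-range forces converges to a sum of single-body Hamiltonians, in which the
original forces between the spins have been incorporated into the parameters $\theta_i$.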


Returning to (10.286), for any $\beta = T^{-1}$, we now determine $\omega_\theta^\beta$ from the Ansatz

\[ \omega_\theta^\beta(a) = \frac{\mathrm{Tr}\,(e^{-\beta h_\theta} a)}{\mathrm{Tr}\,(e^{-\beta h_\theta})}, \tag{10.296} \]

where $\theta$ is found by solving the self-consistency equation

\[ \omega_\theta^\beta = \theta. \tag{10.297} \]

As explained after Corollary 10.23, here $\omega_\theta^\beta : M_n(\mathbb{C})_{sa} \to \mathbb{R}$ is defined by its values
on $isu(n)$ and hence should be seen as a map $isu(n) \to \mathbb{R}$, like $\theta \in su(n)^*$,
so that (10.297) consists of $n^2 - 1$ equations $\omega_\theta^\beta(T_i) = \theta_i$ ($i = 1, \ldots, n^2 - 1$). Alternatively,
one may extend $\theta$ from $isu(n)$ to $iu(n)$ by prescribing $\theta(1_n) = 1$, and
subsequently extend it further to $M_n(\mathbb{C})$ by complex linearity. Clearly, the constant
$h_0(\theta)$ in (10.287) drops out of (5.152) and may be ignored in solving (10.297).

For example, if we take (10.289) with $B = 0$, then (10.297) forces $\theta_1 = \theta_2 = 0$,
whereas the magnetization $2\theta_3 \equiv m = \omega_\theta^\beta(\sigma_3)$ satisfies the famous gap equation

\[ \tanh(\beta m) = m. \tag{10.298} \]


For any $\beta$ this has a solution $m = 0$, i.e., $\theta = 0$ in $B^3$, which corresponds to the tracial
state $\omega(a) = \tfrac{1}{2}\mathrm{Tr}(a)$ normally associated with infinite temperature (i.e., $\beta = 0$). This
state is evidently $\mathbb{Z}_2$-invariant. For $T > T_c = 1/4$ (i.e. $\beta < 4$) this is the only solution.
For $T < T_c$ (or $\beta > 4$), two additional solutions $\pm m_\beta$ (with $m_\beta > 0$) appear, which
break the $\mathbb{Z}_2$ symmetry. For $B > 0$ computations become tedious, but for $\beta \to \infty$,
where $\omega_\theta^\beta$ converges to the ground state of $h_\theta$, one recovers our earlier conclusions.
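
The Ansatz (10.296) is straightforward to evaluate for a single spin. The sketch below takes the Curie-Weiss operator (10.289) without its constant term and computes Gibbs expectations by diagonalization; it is meant only to illustrate how the left-hand side of the self-consistency equation arises, and the function names are illustrative.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)

def h_theta_cw(theta3, b):
    """h_theta of (10.289) without the constant h_0(theta), which cancels in (10.296)."""
    return -2.0 * theta3 * S3 - b * S1

def gibbs_expectation(h, a, beta):
    """Tr(exp(-beta h) a) / Tr(exp(-beta h)) for a self-adjoint matrix h."""
    vals, vecs = np.linalg.eigh(h)
    weights = np.exp(-beta * (vals - vals.min()))  # shifting the spectrum avoids overflow
    rho = (vecs * weights) @ vecs.conj().T / weights.sum()
    return np.trace(rho @ a).real
```

At $B = 0$ the call `gibbs_expectation(h_theta_cw(theta3, 0.0), S3, beta)` returns $\tanh(2\beta\theta_3)$, i.e. $\tanh(\beta m)$ with $m = 2\theta_3$, while the expectation of $\sigma_1$ vanishes.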
Proposition 10.24. The self-consistency equation (10.297) has at least one solution.

Proof. This follows from Brouwer's Fixed Point Theorem (stating that any continuous
map $f$ from a compact convex set $K \subset \mathbb{R}^k$ to itself has a fixed point), applied
to $K = S(M_n(\mathbb{C}))$ and $f(\theta) = \omega_\theta^\beta$, where $\theta \in S(M_n(\mathbb{C}))$, as just explained.


The key result on equilibrium states of homogeneous mean-field theories, then, is:

Theorem 10.25. Let $h_\Lambda$ in (10.271) define a homogeneous mean-field theory with
compact symmetry group $G$. The sequence $(\omega_{\Lambda_N}^\beta)$ of local Gibbs states defined by
(9.96) and (10.271) has a unique $G$-invariant limit point $\hat\omega^\beta$, whose decomposition
into primary states is given by (10.286). The $G$-invariant probability measure $\mu_\beta$ is
concentrated on some $G$-orbit in $S(M_n(\mathbb{C}))$, and the states $\omega_\theta^\beta$ on $M_n(\mathbb{C})$ are given
by (10.296), with Hamiltonians $h_\theta$ defined by (10.287), where $\theta$ satisfies (10.297).

Proof. We just sketch the proof, which is based on the Quantum De Finetti Theorem
8.9. Each operator $T_i^{(\Lambda_N)}$ is permutation-invariant, which property is transferred first
to each local Hamiltonian $h_{\Lambda_N}$, thence to each local Gibbs state $\omega_{\Lambda_N}^\beta$ defined by $h_{\Lambda_N}$,
and finally to each limit point of this sequence. As already noted, Theorem 8.9 then
gives the decomposition (10.286), which by Theorem 8.29 (whose assumption holds
in mean-field models) also gives the primary decomposition of $\hat\omega^\beta$ (i.e., each state
$(\omega_\theta^\beta)^\infty$ is primary on the quasi-local algebra $A$). By our earlier argument centered
on (10.294) - (10.295), time-evolution is implemented in the GNS-representation
induced by such a state. An important step in the proof (which we omit because
it requires various reformulations of the KMS condition we have not discussed) is
that $(\omega_\theta^\beta)^\infty$ satisfies the KMS condition with respect to the dynamics (10.295). This,
in turn, implies (10.296), which, by definition of $\theta$ through (10.290) - (10.291),
gives the self-consistency condition (10.297). The proof is completed by a tricky
argument (which again uses alternatives to the KMS condition) to the effect that
if some $\omega_\theta^\beta$ breaks the $G$-symmetry, the probability measure $\mu_\beta$ on the $G$-orbit in
$S(M_n(\mathbb{C}))$ through $\omega_\theta^\beta$ induced by the normalized Haar measure on $G$, defines the
only possible limit point of the local Gibbs states, and hence must be unique.


Thus SSB can be detected by solving (10.297) and checking if the ensuing state(s)
$\omega_\theta^\beta$ on $M_n(\mathbb{C})$ is (are) $G$-invariant. As we have seen, in the Curie-Weiss model this
is the case for $\beta < 4$, whereas for $\beta > 4$ the measure $\mu_\beta$ in (10.286) is given by

\[ \mu_\beta = \tfrac{1}{2}\big(\delta_{(0,0,m_\beta/2)} + \delta_{(0,0,-m_\beta/2)}\big), \tag{10.299} \]

where $\delta_\theta(f) = f(\theta)$. In such cases, since each local Gibbs state is invariant, one
faces the (by now) familiar threat to Earman's Principle. In response, we expect
Butterfield's Principle to be restored through the introduction of asymmetric flea-type
perturbations to $h_\Lambda$ that are localized in spin configuration space, although at
nonzero temperature all excited states (rather than just the first) will start to play a
role, and the precise details of the "flea" scenario remain to be settled.

10.9 The Goldstone Theorem


So far, we have only discussed the simplest of all symmetry groups, namely G = Zo, 
which is both finite and abelian. Although it will not change our picture of SSB, 
for the sake of completeness (and interest to foundations) we also present a brief 
introduction to continuous symmetries, culminating in the Goldstone Theorem and 
the Higgs mechanism (which at first sight contradict each other and hence require a 
very careful treatment). The former results when the broken symmetry group G is a 
Lie group, whereas the latter arises when it is an infinite-dimensional gauge group. 

Let us start with the simple case $G = SO(2)$, acting on $\mathbb{R}^2$ by rotation. This
induces the obvious action on the classical phase space $T^*\mathbb{R}^2$, i.e.,

\[ R(p,q) = (Rp, Rq), \tag{10.300} \]

cf. (3.94), as well as on the quantum Hilbert space $H = L^2(\mathbb{R}^2)$, that is,

\[ u_R\psi(x) = \psi(R^{-1}x). \tag{10.301} \]

Let us see what changes with respect to the action of $\mathbb{Z}_2$ on $\mathbb{R}$ considered in §10.1.
We now regard the double-well potential V in (10.11) as an SO(2)-invariant function
on $\mathbb{R}^2$ through the reinterpretation of $x^2$ as $x_1^2 + x_2^2$. This is the Mexican hat potential.
Thus the classical Hamiltonian $h(p,q) = p^2/2m + V(q)$, similarly with $p^2 = p_1^2 + p_2^2$,
is SO(2)-invariant, and the set of classical ground states


\[ \mathcal{E}_0 = \{(p,q) \in T^*\mathbb{R}^2 \mid p = 0,\ q^2 = a^2\} \tag{10.302} \]

is the SO(2)-orbit through e.g. the point $(p_1 = p_2 = 0, q_1 = a, q_2 = 0)$. Unlike the
one-dimensional case, the set of ground states is now connected and forms a circle
in phase space, on which the symmetry group SO(2) acts. The intuition behind
the Goldstone Theorem is that a particle can freely move in this circle at no cost
of energy. If we look at mass as inertia, such motion is “massless”, as there is no
obstruction. However, this intuition is only realized in quantum field theory. In quantum
mechanics, the ground state of the Hamiltonian (10.6) (now acting on $L^2(\mathbb{R}^2)$)
remains unique, as in the one-dimensional case. In polar coordinates $(r,\theta)$ we have


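\[ h_\hbar = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\right) + V(r), \tag{10.303} \]

with $V(r) = \tfrac14\lambda(r^2 - a^2)^2$. With

\[ L^2(\mathbb{R}^2) \cong L^2(\mathbb{R}^+) \otimes \ell^2(\mathbb{Z}) \tag{10.304} \]

under Fourier transformation in the angle variable, this becomes

\[ h_\hbar = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{n^2}{r^2}\right) + V(r). \tag{10.305} \]

Since $\hbar^2n^2/2mr^2$ is positive, the ground state $\psi_\hbar^{(0)}$ has $\psi_\hbar^{(0)}(r,n) = 0$ for all $n \neq 0$,
and hence it is SO(2)-invariant, since the SO(2)-action on $L^2(\mathbb{R}^2)$ becomes

\[ u_\theta\psi(r,n) = \exp(in\theta)\psi(r,n), \tag{10.306} \]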


after a Fourier transform. Indeed, from a group-theoretical point of view, the unitary
isomorphism (10.304) is nothing but the decomposition

\[ L^2(\mathbb{R}^2) = \bigoplus_{n\in\mathbb{Z}} H_n, \tag{10.307} \]

where $H_n = L^2(\mathbb{R}^+)$ for all n, but with $\phi_n \in H_n$ transforming under SO(2) as

\[ u_\theta\phi_n(r) = \exp(in\theta)\phi_n(r) \quad (\theta \in [0,2\pi]). \tag{10.308} \]


The SO(2)-invariant subspace of $L^2(\mathbb{R}^2)$, then, is precisely the space $H_0$ in which
$\psi_\hbar^{(0)}$ lies. This is analogous to the situation occurring in one dimension higher (i.e.
$\mathbb{R}^3$) with e.g. the hydrogen atom: in that case, the symmetry group is SO(3), and
$L^2(\mathbb{R}^3)$ decomposes accordingly as

\[ L^2(\mathbb{R}^3) \cong \bigoplus_{j\in\mathbb{N}} H_j; \tag{10.309} \]
\[ H_j = L^2(\mathbb{R}^+) \otimes \mathbb{C}^{2j+1}. \tag{10.310} \]


The ground state for a spherically symmetric potential, then, lies in $H_0$ and is SO(3)-invariant.
For our purposes the relevant comparison is with the one-dimensional
case: the decomposition of $L^2(\mathbb{R})$ under the natural $\mathbb{Z}_2$-action $u_{-1}\psi(x) = \psi(-x)$ is

\[ L^2(\mathbb{R}) = H_0 \oplus H_1; \tag{10.311} \]
\[ H_i = \{\psi \in L^2(\mathbb{R}) \mid \psi(x) = (-1)^i\psi(-x)\}, \quad i = 0,1. \tag{10.312} \]

This time, $H_0$ is the $\mathbb{Z}_2$-invariant subspace containing the ground state $\psi_\hbar^{(0)}$. Being
$\mathbb{Z}_2$-invariant, $\psi_\hbar^{(0)}$ has peaks above both classical minima $\pm a$; in fact, $\psi_\hbar^{(0)}$ is
real-valued and strictly positive. The ground state of the corresponding two-dimensional
system, seen as an element of $L^2(\mathbb{R}^2)$, is just this wave-function $\psi_\hbar^{(0)}$ extended from
$\mathbb{R}$ to $\mathbb{R}^2$ by rotational invariance. Hence the ground state remains real-valued and
strictly positive, with peaks above the circle of classical minima in $\mathbb{R}^2$.


Let us recall the situation for d = 1 (cf. §10.1). The first excited state $\psi_\hbar^{(1)}$ lies
in $H_1$; it is real-valued, like $\psi_\hbar^{(0)}$, but since it has to satisfy $\psi_\hbar^{(1)}(-x) = -\psi_\hbar^{(1)}(x)$,
it cannot be positive. Indeed, with a suitable choice of phase, $\psi_\hbar^{(1)}$ has one positive
peak above a and the same peak but now negative below −a. Then the wave-function

\[ \psi_\hbar^{\pm} = (\psi_\hbar^{(0)} \pm \psi_\hbar^{(1)})/\sqrt{2}, \tag{10.313} \]

is peaked above $\pm a$ alone (i.e., the negative peak of $\pm\psi_\hbar^{(1)}$ below $\mp a$ exactly cancels
the corresponding peak of $\psi_\hbar^{(0)}$). The classical limit of $\psi_\hbar^{(0)}$ comes out as the mixed
state $\tfrac12(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm = (p = 0, q = \pm a)$, but each state $\psi_\hbar^\pm$ has the pure state
$\omega_0^\pm$ as its classical limit. The latter are ground states, and hence in particular they
are time-independent, because the energy difference $E_\hbar^{(1)} - E_\hbar^{(0)}$ between $\psi_\hbar^{(1)}$ and
$\psi_\hbar^{(0)}$ vanishes (even exponentially fast) as $\hbar \to 0$.
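The localization of the combinations in (10.313) can be illustrated numerically. The following is a minimal sketch, not the actual double-well eigenfunctions: as an assumption, the ground state and the first excited state are modelled by the even and odd combinations of two Gaussian peaks above ±a, and the values of a and of the peak width are arbitrary.

```python
import math

A_MIN = 2.0    # position a of the classical minima (assumed value)
WIDTH = 0.4    # width of each peak (assumed value)

def bump(x, center):
    """Unnormalized Gaussian peak centred at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * WIDTH ** 2))

def integrate(f, lo, hi, steps=4000):
    """Composite midpoint rule on [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

def normalized(f):
    """Return f divided by its L^2-norm on [-10, 10]."""
    norm = math.sqrt(integrate(lambda x: f(x) ** 2, -10.0, 10.0))
    return lambda x: f(x) / norm

# Toy stand-ins for the ground state (even) and the first excited state (odd).
psi0 = normalized(lambda x: bump(x, A_MIN) + bump(x, -A_MIN))
psi1 = normalized(lambda x: bump(x, A_MIN) - bump(x, -A_MIN))

def psi_plus(x):
    return (psi0(x) + psi1(x)) / math.sqrt(2.0)

def psi_minus(x):
    return (psi0(x) - psi1(x)) / math.sqrt(2.0)

def weight_right(f):
    """Probability of finding the particle at x > 0."""
    return integrate(lambda x: f(x) ** 2, 0.0, 10.0)

# The even state is spread over both wells; the combinations sit in one well.
print(weight_right(psi0), weight_right(psi_plus), weight_right(psi_minus))
```

In this toy model the even state carries half of its weight on each side, whereas the two combinations are concentrated above +a and −a, respectively.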
A similar but more complicated situation arises in d = 2. The role of the pair

\[ (\psi_\hbar^{(0)} \in H_0,\ \psi_\hbar^{(1)} \in H_1) \]

is now played by an infinite tower of unit vectors

\[ (\psi_\hbar^{(n)} \in H_n,\ n \in \mathbb{Z}), \]

where $\psi_\hbar^{(n)}$ is the lowest energy eigenstate (for $h_\hbar$ in (10.305)) in $H_n \subset L^2(\mathbb{R}^2)$. The
analogue of the states $\psi_\hbar^\pm$ for d = 1 involves a limit which heuristically is like


(N, 
li s 10.314 
jim, va Seam 20 ee 


but this limit does not exist in $L^2(\mathbb{R}^2)$. As in §10.1, we instead rely on the technique
explained around (10.4), which makes the unit vectors in (10.314) converge to some
probability measure $\mu_\hbar$ as $N \to \infty$. In the subsequent limit $\hbar \to 0$, one obtains
a probability measure $\mu_0$ concentrated on a suitable point in the orbit of classical
ground states (10.302). Similarly, in the same sense the ground state $\psi_\hbar^{(0)}$ converges
to a probability measure supported by all of $\mathcal{E}_0$.

To the extent that there is a Goldstone Theorem in classical mechanics, it would
state that motion in the orbit $\mathcal{E}_0$ is free. That is, at fixed $(r = a, p_r = 0)$, where $p_r$ is
the radial component of momentum, one has an effective Hamiltonian

\[ h_a(p_\theta,\theta) = \frac{p_\theta^2}{2ma^2}, \tag{10.315} \]

whose time-independent states $(p_\theta = 0, \theta)$ for arbitrary $\theta \in [0,2\pi)$ yield the
ground states of the system, and whose “excited states”

\[ (p_\theta(t),\theta(t)) = \left(p_\theta(0),\ \theta(0) + \frac{p_\theta(0)\,t}{ma^2}\right) \tag{10.316} \]

give motion along the orbit $\mathcal{E}_0$ with effective mass $ma^2$, whose energy converges
to zero as $p_\theta \to 0$. However, since massless particles (whose existence is the main
conclusion of the usual Goldstone Theorem) are not defined in classical mechanics,
we now turn to relativistic field theory (with which we assume some familiarity).
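The free motion along the orbit described by (10.315) and (10.316) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the parameter values are hypothetical, and the step-wise integrator is plain forward Euler, which suffices here because the momentum is conserved.

```python
def energy(p_theta, m=1.0, a=1.0):
    """Effective Hamiltonian h_a = p_theta^2 / (2 m a^2) on the circle of minima."""
    return p_theta ** 2 / (2.0 * m * a ** 2)

def exact_flow(p_theta, theta, t, m=1.0, a=1.0):
    """Closed-form solution: p_theta is conserved, theta grows linearly in t."""
    return p_theta, theta + p_theta * t / (m * a ** 2)

def euler_flow(p_theta, theta, t, m=1.0, a=1.0, steps=1000):
    """Integrate d(theta)/dt = dh/dp_theta and d(p_theta)/dt = -dh/dtheta = 0."""
    dt = t / steps
    for _ in range(steps):
        theta += dt * p_theta / (m * a ** 2)
    return p_theta, theta

# A state with p_theta = 0 does not move: it is one of the ground states.
print(exact_flow(0.0, 1.3, 5.0))
# The energy of the motion along the orbit tends to zero with p_theta.
print([energy(p) for p in (0.4, 0.2, 0.1, 0.0)])
```

The second line of output decreases to zero, which is the classical shadow of the absence of a gap.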
We now illustrate SSB in classical field theory through a simple example, where
the symmetry group is G = SO(N), but we write things down in such a way
that the generalization to arbitrary scalar field theories is obvious. Suppose we have
N real scalar fields $\varphi = (\varphi_1,\ldots,\varphi_N)$, on which SO(N) acts in the defining representation
on $\mathbb{R}^N$. Following the physics literature, from now on we sum over repeated
indices like i and $\mu$ (Einstein summation convention). Let the Lagrangian

\[ \mathcal{L} = \tfrac12\partial_\mu\varphi_i\,\partial^\mu\varphi_i - V(\varphi), \tag{10.317} \]

contain an SO(N)-invariant potential V, typically of the form (with $\varphi^2 = \sum_{i=1}^N\varphi_i^2$)

\[ V(\varphi) = -\frac{m^2}{2}\varphi^2 + \frac{\lambda}{4}\varphi^4, \tag{10.318} \]

where $\lambda > 0$, but $m^2$ may have either sign. If $m^2 < 0$, the minimum of V lies at
$\varphi = 0$, but if $m^2 > 0$ the minima form the SO(N)-orbit through

\[ \varphi^c = (v,0,\ldots,0); \tag{10.319} \]
\[ v = m/\sqrt{\lambda} = \|\varphi^c\|. \tag{10.320} \]


The idea is that the physical fields are excitations of the “vacuum state” $\varphi^c$, so that,
instead of $\varphi$, as the appropriate “small oscillation” field one should use

\[ \chi(x) = \varphi(x) - \varphi^c. \tag{10.321} \]

Consequently, the potential is expanded in a Taylor series for small $\chi$ as

\[ V(\varphi) = V(\varphi^c) + \tfrac12V''_{ij}\chi_i\chi_j + O(\chi^3); \tag{10.322} \]
\[ V''_{ij} = \frac{\partial^2V}{\partial\varphi_i\,\partial\varphi_j}(\varphi^c). \tag{10.323} \]


Note that the linear term vanishes because $V'(\varphi^c) = 0$. We now use the SO(N)-invariance
of V, i.e., $V(g\varphi) = V(\varphi)$ for all $g \in SO(N)$. For $T_a \in \mathfrak{g}$ (i.e. the Lie
algebra of G, realized by anti-symmetric traceless N × N matrices) this yields

\[ \frac{\partial V}{\partial\varphi_i}(\varphi)\,(T_a)_{ij}\varphi_j = 0. \tag{10.324} \]

Differentiation with respect to $\varphi_k$ and putting $\varphi = \varphi^c$ then gives

\[ V''_{ik}(T_a)_{ij}\varphi^c_j = 0. \tag{10.325} \]


In general, let $H \subset G$ be the stabilizer of $\varphi^c$, i.e., $g \in H$ iff $g\varphi^c = \varphi^c$. In our example
(10.318) - (10.319), we evidently have H = SO(N − 1). Then $T_a\varphi^c = 0$ for all
generators $T_a$ of the Lie algebra $\mathfrak{h}$ of H, so that there are

\[ M = \dim(G) - \dim(H) = \dim(G/H) = \dim(G\cdot\varphi^c) \tag{10.326} \]
linearly independent null eigenvectors of $V''$ (seen as an N × N matrix). This number
equals the dimension of the submanifold of $\mathbb{R}^N$ where V assumes its minimum. In
our example we have M = N − 1, since $\dim(SO(N)) = \tfrac12N(N-1)$. We now perform
an affine field redefinition, based on an affine coordinate transformation in $\mathbb{R}^N$ that
diagonalizes the matrix $V''$. The original (real) fields were $\varphi = (\varphi_1,\ldots,\varphi_N)$, and the
new (real) fields are $(\chi_1,\theta_2,\ldots,\theta_N)$, with

\[ \chi_1 = \varphi_1 - v, \tag{10.327} \]

as in (10.321), and the Goldstone fields are defined, also in general, by

\[ \theta_a = \frac{1}{v}\langle T_a\varphi^c,\varphi\rangle = \frac{1}{v}(T_a)_{ij}\varphi^c_j\varphi_i. \tag{10.328} \]

Here $\langle\cdot,\cdot\rangle$ denotes the inner product in $\mathbb{R}^N$, and we have chosen a basis of $\mathfrak{g}$ in which
the elements $(T_1,\ldots,T_{\dim(H)})$ form a basis of $\mathfrak{h}$, completed by M further elements
$(T_{\dim(H)+1},\ldots,T_{\dim(G)})$, so as to have a basis of $\mathfrak{g}$. The index a in (10.328), then,
runs from dim(H) + 1 to dim(G), so that there are M Goldstone fields, cf. (10.326).
In our running example, this number was shown to be M = N − 1, and in view of
(10.319), the field $\theta_a = (T_a)_{i1}\varphi_i$ is a linear combination of the $\varphi_2$ till $\varphi_N$.

The simplest example is N = 2, with potential (10.318) and $m^2 > 0$. With the
single generator $T = -i\sigma_2$, we obtain $\theta = \varphi_2$. Since $V'' = \mathrm{diag}(2m^2,0)$, we see
that the mass term $-\tfrac12m^2\varphi_1^2$ in (10.318) (with $\varphi^2 = \varphi_1^2 + \varphi_2^2$) changes from the
“wrong” sign $-m^2$ to the “right” sign $+2m^2$ in (10.322), whilst $-\tfrac12m^2\varphi_2^2$ in (10.318)
disappears, so that the field $\theta$ comes out to be massless. Indeed, this is the point
of the introduction of the Goldstone fields: in view of (10.325) and (10.328), the
Goldstone fields do not occur in the quadratic term in (10.322) and hence they are
massless, in satisfying a field equation of the form $\partial_\mu\partial^\mu\theta_a = \cdots$, where $\cdots$ does not
contain any term linear in any field. This proves the classical Goldstone Theorem:


Theorem 10.26. Suppose that a compact Lie group $G \subset SO(N)$ acts on N real
scalar fields $\varphi = (\varphi_1,\ldots,\varphi_N)$, leaving the potential V in the Lagrangian (10.317)
invariant. If G is spontaneously broken to an unbroken subgroup $H \subset G$ (in the
sense that the stability group of some point $\varphi^c$ in the G-orbit minimizing V is H),
then there are at least dim(G/H) massless fields, i.e., there is a field transformation

\[ (\varphi_1,\ldots,\varphi_N) \mapsto (\chi_1,\ldots,\chi_{N-M},\theta_1,\ldots,\theta_M) \quad (M = \dim(G) - \dim(H)), \tag{10.329} \]

that is invertible in a neighborhood of $\varphi = \varphi^c$, such that the potential $V(\varphi)$,
re-expressed in the fields $\chi$ and $\theta$, has no quadratic terms in $\theta$.
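The counting in (10.325) and (10.326) can be checked numerically for the running example G = SO(N), H = SO(N − 1). The sketch below makes several assumptions that are not in the text: the values of $m^2$ and $\lambda$ are arbitrary, the basis of so(N) consists of the elementary rotations in the coordinate planes (with the sign convention that reduces to $T = -i\sigma_2$ for N = 2), and all function names are hypothetical.

```python
def vacuum(N, m2, lam):
    """The point phi_c = (v, 0, ..., 0) with v = m / sqrt(lambda), cf. (10.319)-(10.320)."""
    v = (m2 / lam) ** 0.5
    return [v] + [0.0] * (N - 1)

def hessian(phi, m2, lam):
    """Second derivative of V = -m2/2 |phi|^2 + lam/4 |phi|^4:
    V''_ij = (-m2 + lam |phi|^2) delta_ij + 2 lam phi_i phi_j."""
    r2 = sum(c * c for c in phi)
    n = len(phi)
    return [[(-m2 + lam * r2) * (1.0 if i == j else 0.0) + 2.0 * lam * phi[i] * phi[j]
             for j in range(n)] for i in range(n)]

def generators(N):
    """Basis of so(N): antisymmetric matrices rotating the (i, j)-plane, i < j."""
    basis = []
    for i in range(N):
        for j in range(i + 1, N):
            T = [[0.0] * N for _ in range(N)]
            T[i][j], T[j][i] = -1.0, 1.0
            basis.append(T)
    return basis

def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def is_zero(x, tol=1e-12):
    return all(abs(c) < tol for c in x)

def count_goldstone_modes(N, m2=1.5, lam=0.7):
    """Number of generators that move phi_c; each T phi_c is a null vector of V''."""
    phi_c = vacuum(N, m2, lam)
    H = hessian(phi_c, m2, lam)
    broken = [T for T in generators(N) if not is_zero(apply(T, phi_c))]
    assert all(is_zero(apply(H, apply(T, phi_c)), tol=1e-9) for T in broken)
    return len(broken)

for N in (2, 3, 5):
    print(N, count_goldstone_modes(N))
```

For each N the count equals N − 1, in agreement with M = dim(SO(N)) − dim(SO(N − 1)), and the only nonzero entry of the Hessian at $\varphi^c$ is $2m^2$.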


The local invertibility of the field redefinition around $\varphi^c \neq 0$ is crucial; in our example,
where $\chi = \chi_1 = \varphi_1 - v$ and $\theta_a$ is given by (10.328), this may be checked explicitly.
An alternative proof of Theorem 10.26 uses nonlinear Goldstone fields, viz.

\[ \varphi(x) = e^{\frac{1}{v}\theta_a(x)T_a}(\varphi^c + \chi(x)), \tag{10.330} \]
where the sum over a (implicit in the Einstein summation convention) ranges from 1
to M, $v = \|\varphi^c\|$, and the fields $\chi = (\chi_1,\ldots,\chi_{N-M})$ are chosen orthogonal (in $\mathbb{R}^N$) to
each $T_a\varphi^c$, a = 1,...,M, and hence to the $\theta_a$. Provided that the generators of SO(N)
(and hence of $G \subset SO(N)$) have been chosen such that

\[ \langle T_a\varphi^c, T_b\varphi^c\rangle = v^2\delta_{ab}, \tag{10.331} \]

the fields $\theta_a$ defined by (10.330) coincide with the fields in (10.328) up to quadratic
terms in $\chi$ and $\theta$; to see this, expand the exponential and also use the fact that both
$\langle T_a\varphi^c,\varphi^c\rangle$ and $\langle T_a\varphi^c,\chi\rangle$ vanish. This transformation is only well defined if $v \neq 0$,
i.e., if SSB from G to H occurs, and its existence implies the Goldstone Theorem
10.26, for by (10.330) and G-invariance, $V(\varphi)$ is independent of $\theta$.
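For N = 2 the nonlinear parametrization (10.330) is simply a rotation of the vector (v + χ, 0) by the angle θ/v, so both claims above can be verified directly. The following sketch assumes arbitrary values for $m^2$ and $\lambda$ and uses hypothetical function names; it checks that V does not depend on θ and that the linear Goldstone field of (10.328), which here equals $\varphi_2$, agrees with θ up to quadratic terms.

```python
import math

M2, LAM = 1.0, 0.5                 # assumed parameters of the potential (10.318)
V_VAC = math.sqrt(M2 / LAM)        # v = m / sqrt(lambda)

def potential(phi):
    """V(phi) = -m^2/2 |phi|^2 + lambda/4 |phi|^4."""
    r2 = sum(c * c for c in phi)
    return -0.5 * M2 * r2 + 0.25 * LAM * r2 * r2

def nonlinear_field(theta, chi):
    """phi = exp((theta/v) T)(v + chi, 0) with T = [[0,-1],[1,0]]: a rotation by theta/v."""
    angle, r = theta / V_VAC, V_VAC + chi
    return (r * math.cos(angle), r * math.sin(angle))

def linear_goldstone(phi):
    """theta = (1/v) <T phi_c, phi> = phi_2 for phi_c = (v, 0)."""
    return phi[1]

# V depends on chi only, not on theta.
values = [potential(nonlinear_field(theta, 0.3)) for theta in (-1.0, 0.0, 0.7, 2.5)]
print(values)

# Linear and nonlinear Goldstone fields agree up to quadratic terms.
theta, chi = 1e-3, 1e-3
print(linear_goldstone(nonlinear_field(theta, chi)) - theta)
```

The printed difference is of the order of the product θχ/v, as the expansion of the exponential in (10.330) suggests.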


The Goldstone Theorem can be derived in quantum field theory, but in the spirit
of this chapter we will discuss it rigorously for quantum spin systems. Far from
considering the most general case, we merely treat the simplest setting. We assume
that A is a quasi-local C*-algebra given by (8.130), with $H = \mathbb{C}^n$. Furthermore:


1. The group of space translations $\mathbb{Z}^d$ acts on A by automorphisms $\tau_x$, and so does
the group $\mathbb{R}$ of time translations by automorphisms $\alpha_t$ commuting with the $\tau_x$ (cf.
§9.3); we often write $\alpha_{(x,t)}$ for $\alpha_t \circ \tau_x$ as well as a(x,t) for $\alpha_{(x,t)}(a)$.

2. A compact Lie group G acts on $H = \mathbb{C}^n$ through a unitary representation u and
hence acts on A by automorphisms $\gamma_g$ as in (10.58) - (10.59), such that

\[ \gamma_g \circ \alpha_{(x,t)} = \alpha_{(x,t)} \circ \gamma_g \quad ((x,t) \in \mathbb{Z}^d \times \mathbb{R},\ g \in G). \tag{10.332} \]

3. There exists a pure translation-invariant ground state $\omega$.

4. One has SSB in that $\omega \circ \gamma_g \neq \omega$ for all $g \in G_a \subset G$, where

\[ G_a = \{\exp(sT_a),\ s \in \mathbb{R},\ T_a \in \mathfrak{g}\}. \tag{10.333} \]


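5. There is an n-tuple $\varphi = (\varphi_1,\ldots,\varphi_n)$ of local operators $\varphi_\alpha \in M_n(\mathbb{C})$ that transforms
under G by $\varphi \mapsto u_g\varphi u_g^* = \gamma_g(\varphi)$, and defines an order parameter $\delta_a\varphi$ by

\[ \delta_a\varphi = \frac{d}{ds}\left(\gamma_{\exp(sT_a)}(\varphi)\right)_{|s=0}, \tag{10.334} \]

at least for SSB of $G_a$ (as above) in that, cf. Definition 10.6,

\[ \omega(\delta_a\varphi) \neq 0. \tag{10.335} \]

6. Writing $j_a^0 = iu'(T_a) \in M_n(\mathbb{C})$, it follows that $\delta_a\varphi = -i[j_a^0,\varphi]$, and hence that

\[ \delta_a\varphi(x) = -i\lim_{\Lambda\nearrow\mathbb{Z}^d}\sum_{y\in\Lambda}[j_a^0(y),\varphi(x)] \quad (x \in \mathbb{Z}^d), \tag{10.336} \]

since by (8.132) (i.e., Einstein locality) only the term y = x will contribute. Physicists
then wish to define a charge by $Q_a = \sum_{y\in\mathbb{Z}^d}j_a^0(y)$ and write (10.336) as
$\delta_a\varphi(x) = -i[Q_a,\varphi(x)]$, but $Q_a$ does not exist precisely in the case of SSB!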
Eq. (10.336) motivates the crucial assumption for the Goldstone Theorem, viz.

\[ \omega(\delta_a\varphi(x,t)) = -i\lim_{\Lambda\nearrow\mathbb{Z}^d}\sum_{y\in\Lambda}\omega([j_a^0(y),\varphi(x,t)]) \quad (x \in \mathbb{Z}^d,\ t \in \mathbb{R}), \tag{10.337} \]

which incorporates the condition that the sum over y converge absolutely.
Although (10.337) at first sight softens (10.336) in turning an operator equation
into a numerical one, in fact (10.337) decisively sharpens (10.336) by involving
the time-dependence of $\varphi$, whose propagation speed should be sufficiently small
to enable the limit in (10.336) to catch up with the limit in (10.337). As such,
eq. (10.337) is satisfied with short-range forces, but the Meissner effect in
superconductivity and the closely related Higgs mechanism in gauge theories (both
of which circumvent the Goldstone Theorem) are possible precisely because in
those cases (10.337) fails (at least in physical gauges, see also §10.10).
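7. Finally, we make two assumptions just for convenience, namely

\[ \varphi_\alpha(x)^* = \varphi_\alpha(x); \tag{10.338} \]
\[ \omega(\varphi_\alpha(x)) = 0. \tag{10.339} \]

If these are not the case, one could simply take real and imaginary components
of $\varphi_\alpha$ and/or redefine $\varphi_\alpha$ as $\varphi'_\alpha = \varphi_\alpha - \omega(\varphi_\alpha)\cdot 1_A$, so that $\omega(\varphi'_\alpha(x)) = 0$.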
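The relation $\delta_a\varphi = -i[j_a^0,\varphi]$ in assumption 6 can be checked on a single site. The sketch below is only an illustration under assumptions that are not in the text: it takes n = 2, the rotations about the third axis represented by $u(\exp(sT)) = \exp(-is\sigma_3/2)$ (so that $j = iu'(T) = \sigma_3/2$), and the local operator $\varphi = \sigma_1$; the derivative in (10.334) is approximated by a central difference.

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Hermitian adjoint of a 2x2 complex matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

SIGMA1 = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA3 = [[1 + 0j, 0j], [0j, -1 + 0j]]

def u(s):
    """Assumed representation: u(exp(sT)) = exp(-i s sigma_3 / 2)."""
    return [[cmath.exp(-0.5j * s), 0j], [0j, cmath.exp(0.5j * s)]]

def gamma(s, phi):
    """Automorphism phi -> u phi u^*."""
    return matmul(matmul(u(s), phi), dagger(u(s)))

def delta(phi, h=1e-6):
    """Central-difference approximation of d/ds gamma_s(phi) at s = 0, cf. (10.334)."""
    plus, minus = gamma(h, phi), gamma(-h, phi)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

J = [[0.5 * x for x in row] for row in SIGMA3]     # j = i u'(T) = sigma_3 / 2

lhs = delta(SIGMA1)
rhs = [[-1j * x for x in row] for row in commutator(J, SIGMA1)]
print(lhs)
print(rhs)
```

Both sides come out as the Pauli matrix $\sigma_2$, up to the discretization error of the finite difference.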


The Goldstone Theorem provides information about the joint energy-momentum
spectrum of the theory at hand. To define this notion, we exploit the fact that from
assumption no. (3) and Corollary 9.12 we obtain a unitary representation $u_\omega$ of
the (locally compact) abelian space-time translation group $A = \mathbb{Z}^d \times \mathbb{R}$ on the GNS-representation
space $H_\omega$ induced by $\omega$. The SNAG-Theorem C.114 applied to A, with
dual $\hat{A} = \mathbb{T}^d \times \mathbb{R}$ (cf. Proposition C.108), then yields a projection-valued measure

\[ e_\omega : \mathcal{B}(\mathbb{R}\times\mathbb{T}^d) \to \mathcal{P}(H_\omega), \tag{10.340} \]

as a map from the Borel sets in $\mathbb{R}\times\mathbb{T}^d$ to the projection lattice in $B(H_\omega)$, such that

\[ 1_{H_\omega} = \int_{\mathbb{R}^+}\int_{\mathbb{T}^d}de_\omega(E,k); \tag{10.341} \]
\[ u_\omega(y,t) = \int_{\mathbb{R}^+}\int_{\mathbb{T}^d}e^{itE - iy\cdot k}\,de_\omega(E,k) \quad (y \in \mathbb{Z}^d,\ t \in \mathbb{R}). \tag{10.342} \]

Here $k = (k_1,\ldots,k_d)$, $y\cdot k = \sum_{j=1}^dy_jk_j$, and we have reduced the integration range over
E (which a priori would be $\mathbb{R}$) to $\mathbb{R}^+$. Indeed, by Stone’s Theorem we have $u_\omega(t) =
\exp(ith_\omega)$, where $\sigma(h_\omega) \subseteq [0,\infty)$ because $\omega$ is a ground state by assumption, and
the support of $e_\omega$ is evidently contained in $\mathbb{T}^d \times \sigma(h_\omega)$ (cf. Definition A.16).


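Definition 10.27. The joint energy-momentum spectrum $\sigma(h_\omega,p_\omega)$ of a space-time
invariant state $\omega$ (i.e., $\omega\circ\alpha_{(x,t)} = \omega$, $(x,t) \in \mathbb{Z}^d\times\mathbb{R}$) is the support of the
projection-valued measure $e_\omega$ associated to the GNS-representation $\pi_\omega$, i.e., the
smallest closed set $\sigma(h_\omega,p_\omega) \subset \mathbb{T}^d\times\mathbb{R}$ such that $e_\omega((\mathbb{T}^d\times\mathbb{R})\setminus\sigma(h_\omega,p_\omega)) = 0$.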
The notation $\sigma(h_\omega,p_\omega)$ is purely symbolic here, since (as opposed to the continuum
case) the group $\mathbb{Z}^d$ of spatial translations is discrete and hence has no generators $p_\omega$.
Since $u_\omega(x,t)\Omega_\omega = \Omega_\omega$, the origin (0,0) certainly lies in $\sigma(h_\omega,p_\omega)$, with

\[ e_\omega(0,0) = |\Omega_\omega\rangle\langle\Omega_\omega|, \tag{10.343} \]

which by Theorem 9.14 is the unique $\mathbb{Z}^d\times\mathbb{R}$-invariant state in $H_\omega$. Denoting this
contribution to $e_\omega$ by $e_\omega^{(0)}$, in many physical theories one has $e_\omega = e_\omega^{(0)} + e_\omega^{(1)} + \cdots$,
where $e_\omega^{(1)}$ is supported on the graph of some continuous function $k \mapsto \varepsilon_k \geq 0$, i.e.,

\[ \{(k,\varepsilon_k),\ k \in \mathbb{T}^d\} \subset \sigma(h_\omega,p_\omega) \subset \mathbb{T}^d\times\mathbb{R}. \tag{10.344} \]


The joint energy-momentum spectrum may be studied in part by considering 


fle.) = fare a((20).068.)) 


yezd 


= 21D fate Nm (Qo, meal J2(0))euw(9)%0(Ga(0)) 20) 
yezay © 


= [., [ ((Qo.m0 (4810) )dew(E.k) to (Pa(0)) 2u)8(€-E)5(p~b) 
— (QoTo(Pa(0) dew (Ek) Ka (ja(0)) Qa) 5(€ +E)S(p+k)), (10.345) 


i.e., the Fourier transform of the two-point function defined by $j_a^0$ and $\varphi_\alpha$, which is
a distribution on the dual group $\mathbb{T}^d\times\mathbb{R}$; for the third equality we used a distributional
version of the Fourier inversion formula (C.382). For example, if we replace
$e_\omega(E,k)$ by $e_\omega^{(1)}(E,k)$, then, since $e_\omega^{(1)}$ is absolutely continuous with respect to Haar
measure $d^dk$ on $\mathbb{T}^d$, we see that $f(\varepsilon,p)$ is proportional to $\delta(\varepsilon - \varepsilon_p)$.

Theorem 10.28. Under assumptions 1-7 (notably (10.337) and SSB of some continuous
symmetry), the Hamiltonian $h_\omega$ has continuous spectrum starting at zero and
hence has no gap. If there is an excitation spectrum $e_\omega^{(1)}$ as explained above, with

\[ \int\langle\Omega_\omega,\pi_\omega(j_a^0(0))\,de_\omega^{(1)}(E,k)\,\pi_\omega(\varphi_\alpha(0))\Omega_\omega\rangle \neq 0, \tag{10.346} \]

then the continuous function $k \mapsto \varepsilon_k$ defining the spectrum satisfies $\varepsilon_0 = 0$.


Proof. Since the sum in (10.337) converges absolutely, the Fourier transform $f(t,p)$
of $y \mapsto \omega([j_a^0(y),\varphi(x,t)])$ in y alone is continuous in p, and by (10.337) we have

\[ i\omega(\delta_a\varphi(x,t)) = f(t,0). \tag{10.347} \]

By (10.332), the left-hand side is independent of x and t, hence the Fourier transform
$f(\varepsilon,0)$ of the right-hand side in t is proportional to $\delta(\varepsilon)$. Since (10.343) does not
contribute to f by (10.339), the calculation (10.345) shows that $f(\varepsilon,0) = 0$ if $\sigma(h_\omega)$
has a gap. But $f(\varepsilon,0) \neq 0$ by (10.335), and so $\sigma(h_\omega)$ has no gap. Similarly, for the
final claim note that $f(\varepsilon,0) \sim \delta(\varepsilon - \varepsilon_0)$ as well as $f(\varepsilon,0) \sim \delta(\varepsilon)$.
10.10 The Higgs mechanism


We proceed to a discussion of SSB in gauge theories, especially with an eye on the 
Higgs Mechanism, which plays a central role in the Standard Model of high-energy 
physics (whose empirical confirmation was more or less finished with the discovery 
of the Higgs boson at CERN, announced on July 4, 2012). 

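We look at the Abelian Higgs Model, given by the Lagrangian

\[ \mathcal{L} = -\tfrac14F_A^2 + \tfrac12\langle D_\mu^A\varphi, D^\mu_A\varphi\rangle - V(\varphi), \tag{10.348} \]

where $\varphi = (\varphi_1,\varphi_2)$ is a scalar doublet, the usual electromagnetic field strength is

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{10.349} \]

in terms of which $F_A^2 = F_{\mu\nu}F^{\mu\nu}$, and the covariant derivative is

\[ D_\mu^A = \partial_\mu - eA_\mu\cdot T = \partial_\mu\cdot 1_2 + ieA_\mu\cdot\sigma_2. \tag{10.350} \]

Here e is some coupling constant, identified with the unit of electrical charge. We
still assume that V only depends on $\|\varphi\|^2 = \langle\varphi,\varphi\rangle$ and hence is SO(2)-invariant.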

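The novel situation compared to (10.317) and the like is that, whereas (10.317) is
invariant under global SO(2) transformations, the Lagrangian (10.348) is invariant
under local SO(2) gauge transformations that depend on x, namely

\[ \varphi(x) \mapsto e^{\alpha(x)\cdot T}\varphi(x) = \begin{pmatrix}\cos\alpha(x) & -\sin\alpha(x)\\ \sin\alpha(x) & \cos\alpha(x)\end{pmatrix}\begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix}; \tag{10.351} \]
\[ A_\mu(x) \mapsto A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x). \tag{10.352} \]

We say that the local gauge group $\mathcal{G} = C^\infty(\mathbb{R}^4,U(1))$ acts on the space of fields
$(A,\varphi)$ by (10.351) - (10.352). Now suppose V has a minimum at some constant
value $\varphi^c \neq 0$. In that case, any field configuration

\[ \varphi(x) = \exp(\alpha(x)\cdot T)\varphi^c; \tag{10.353} \]
\[ A_\mu(x) = (1/e)\partial_\mu\alpha(x) \quad (\alpha \in \mathcal{G}), \tag{10.354} \]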


minimizes the action. Hence the possible “vacua” of the model comprise the
(infinite-dimensional) orbit $\mathcal{V}$ of the gauge group through $(A = 0, \varphi = \varphi^c)$. Note
that $D_\mu^A\varphi = 0$ for $(A,\varphi) \in \mathcal{V}$, i.e., $\varphi$ is covariantly constant along the vacuum orbit
(whereas for global symmetries it is constant full stop). Relative to the (arbitrary)
choice $(0,\varphi^c) \in \mathcal{V}$, we then introduce real fields $\chi$ and $\theta$, called the Higgs field and
the would-be Goldstone boson, respectively, by (10.330), which now simply reads

\[ \begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix} = e^{\frac{1}{v}\theta(x)\cdot T}\begin{pmatrix}v + \chi(x)\\ 0\end{pmatrix}. \tag{10.355} \]
After this redefinition of the scalar fields, the Lagrangian (10.348) becomes

\[ \mathcal{L} = -\tfrac14F_B^2 + \tfrac12\partial_\mu\chi\,\partial^\mu\chi + \tfrac12e^2(v+\chi)^2B_\mu B^\mu - V(v+\chi,0), \tag{10.356} \]

where $B_\mu = A_\mu - (1/ev)\partial_\mu\theta$, and $F_B^2 = F_{\mu\nu}F^{\mu\nu}$ for $F_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$. This describes
a vector boson B with mass term $\tfrac12m_B^2B_\mu B^\mu$, with $m_B^2 = e^2v^2 > 0$ (as opposed
to the massless vector field A), and a scalar field $\chi$ with mass term $\tfrac12m_\chi^2\chi^2$,
with $m_\chi^2 = (\partial^2V/\partial\varphi_1^2)_{|\varphi=(v,0)} > 0$ (since V supposedly has a minimum at $\varphi^c = (v,0)$).
This is the Higgs mechanism: the gauge field becomes massive, whilst the massless
(“would-be”) Goldstone boson disappears from the theory: it is (allegedly)
“eaten” by the gauge field. Thus the scalar degree of freedom $\theta$ that seems lost
is recovered as the longitudinal component of the massive vector field (which for a
gauge field would have been an unphysical gauge degree of freedom, see below).
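The step from (10.348) to (10.356) rests on the pointwise identity $\tfrac12\|D^A\varphi\|^2 = \tfrac12(\partial\chi)^2 + \tfrac12e^2(v+\chi)^2B^2$ for each space-time direction separately. A minimal numerical sketch of this identity follows; it assumes a single coordinate x, arbitrary smooth profiles for χ, θ and A, arbitrary values of e and v, and central differences for the derivatives (all names are hypothetical, and the metric signs that combine the directions are left out).

```python
import math

E_CHARGE, V_VAC = 0.3, 2.0      # assumed coupling e and vacuum value v

def chi(x):
    return 0.1 * math.sin(x)

def theta(x):
    return 0.5 * math.cos(2.0 * x)

def gauge_field(x):
    return 0.7 + 0.2 * x

def phi(x):
    """phi = exp((theta/v) T)(v + chi, 0), cf. (10.355)."""
    r, angle = V_VAC + chi(x), theta(x) / V_VAC
    return (r * math.cos(angle), r * math.sin(angle))

def deriv(f, x, h=1e-5):
    """Central-difference derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def kinetic_in_phi(x):
    """(1/2)|D phi|^2 with D phi = d phi - e A T phi and T phi = (-phi_2, phi_1)."""
    p1, p2 = phi(x)
    d1 = deriv(lambda y: phi(y)[0], x)
    d2 = deriv(lambda y: phi(y)[1], x)
    a = gauge_field(x)
    D1 = d1 + E_CHARGE * a * p2
    D2 = d2 - E_CHARGE * a * p1
    return 0.5 * (D1 ** 2 + D2 ** 2)

def kinetic_in_chi_B(x):
    """(1/2)(d chi)^2 + (1/2) e^2 (v + chi)^2 B^2 with B = A - (1/ev) d theta."""
    B = gauge_field(x) - deriv(theta, x) / (E_CHARGE * V_VAC)
    return 0.5 * deriv(chi, x) ** 2 + 0.5 * E_CHARGE ** 2 * (V_VAC + chi(x)) ** 2 * B ** 2

for x in (-1.0, 0.3, 2.0):
    print(kinetic_in_phi(x), kinetic_in_chi_B(x))
```

The two columns agree up to the finite-difference error, and θ enters the right-hand column only through B.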


In the description just given, the Higgs mechanism in classical field theory is 
seen as a consequence of SSB. Remarkably, there is an alternative account of the 
Higgs mechanism, according to which it has nothing to do with SSB! Namely, we 
now perform a field redefinition analogous to (10.355) etc. straight away, viz. 


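\[ \begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix} = e^{\theta(x)\cdot T}\begin{pmatrix}\rho(x)\\ 0\end{pmatrix}; \tag{10.357} \]
\[ A_\mu = B_\mu + (1/e)\partial_\mu\theta. \tag{10.358} \]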


This transformation is defined and invertible in a neighbourhood of any point
$(\rho_0,\theta_0,B_0)$, where $\rho_0 > 0$, $\theta_0 \in (-\pi,\pi)$, and $B_0$ is arbitrary. The new fields $\rho$ and B
are gauge-invariant: for the gauge transformation (10.351) becomes

\[ \theta(x) \mapsto \theta(x) + \alpha(x); \tag{10.359} \]
\[ \rho(x) \mapsto \rho(x), \tag{10.360} \]

and in view of (10.352), B does not transform at all. The Lagrangian becomes

\[ \mathcal{L} = -\tfrac14F_B^2 + \tfrac12\partial_\mu\rho\,\partial^\mu\rho + \tfrac12e^2\rho^2B_\mu B^\mu - V(\rho), \tag{10.361} \]

with $V(\rho) \equiv V(\rho,0)$. This is a Lagrangian without any internal symmetries at all
(not even $\mathbb{Z}_2$, since $\rho > 0$), but of course one can still look for classical vacua that
minimize the energy and hence the potential $V(\rho)$. If $\rho = 0$ is the absolute minimum,
then the above field redefinition is a fortiori invalidated, but if $V'(v) = 0$ for
some $v > 0$, we proceed as before, introducing a Higgs field $\chi(x) = \rho(x) - v$, and
recovering the Lagrangian (10.356). This once again leads to the Higgs mechanism.

This can be generalized to the nonabelian case; since it suffices to explain the
idea, we just discuss the SU(2) case. In (10.348), the scalar field $\varphi = (\varphi_1,\varphi_2)$ is now
complex, forming an SU(2) doublet, the brackets $\langle\cdot,\cdot\rangle$ now denote the inner product
in $\mathbb{C}^2$, the nonabelian gauge field is $A = A^a\sigma_a$ (where the Pauli matrices $\sigma_a$, a =
1,2,3, form a self-adjoint basis of the Lie algebra of SU(2)), with associated field
strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + g[A_\mu,A_\nu]$ and covariant derivative $D_\mu^A = \partial_\mu + igA_\mu$.
With $F_A^2 = F^a_{\mu\nu}F_a^{\mu\nu}$ the Lagrangian (10.348) is invariant under the transformations

\[ \varphi(x) \mapsto e^{i\alpha_a(x)\sigma_a}\varphi(x); \tag{10.362} \]
\[ A_\mu(x) \mapsto e^{i\alpha_a(x)\sigma_a}\left(A_\mu(x) - (i/g)\partial_\mu\right)e^{-i\alpha_a(x)\sigma_a}. \tag{10.363} \]

The definition of the gauge-invariant fields B and $\rho$ à la (10.357) - (10.358) is now

\[ \varphi(x) = e^{i\theta_a(x)\sigma_a}\begin{pmatrix}\rho(x)\\ 0\end{pmatrix}; \tag{10.364} \]
\[ A_\mu(x) = e^{i\theta_a(x)\sigma_a}\left(B_\mu(x) - (i/g)\partial_\mu\right)e^{-i\theta_a(x)\sigma_a}, \tag{10.365} \]

which leads, mutatis mutandis, to the very same Lagrangian (10.361).


As a compromise between these two derivations of the Higgs mechanism, it is
also possible to fix the gauge by picking the representative $(\varphi,A)$ in each $\mathcal{G}$-orbit for
which $\varphi_2(x) = 0$ and $\varphi_1(x) > 0$; note that this so-called unitary gauge is ill-defined
if $\varphi(x) = 0$. Calling this unique representative $(\rho,B)$, we are again led to (10.361).

Gauge field theories are constrained systems, in which the apparent degrees of
freedom in the Lagrangian are not the physical ones. For free electromagnetism,
the Lagrangian is $\mathcal{L}(A) = -\tfrac14F_{\mu\nu}F^{\mu\nu}$, with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. In terms of the
gauge-invariant fields $E_i = F_{i0} = \partial_iA_0 - \partial_0A_i$ and $B = \nabla\times A$, Maxwell’s equations

\[ \nabla\cdot E = 0; \tag{10.366} \]
\[ \partial E/\partial t = \nabla\times B; \tag{10.367} \]
\[ \partial B/\partial t = -\nabla\times E; \tag{10.368} \]
\[ \nabla\cdot B = 0, \tag{10.369} \]

then arise as follows: eqs. (10.366) and (10.367) correspond to the Euler-Lagrange
equation for $A_0$ and $A_i$, respectively, whereas (10.368) and (10.369) immediately
follow from the definitions of B and E in terms of A. The Maxwell equations are in
Hamiltonian form, with canonical momenta $\Pi_\mu = \partial\mathcal{L}/\partial\dot{A}_\mu$; this yields $\Pi_i = -E_i$,
as well as the primary constraint $\Pi_0 = 0$. Nonetheless, the canonical Hamiltonian

\[ h = \int d^3x\,\left(\Pi_\mu(x)\dot{A}_\mu(x) - \mathcal{L}(x)\right) = \int d^3x\,\left(\tfrac12E(x)^2 + \tfrac12B(x)^2 - A_0(x)\nabla\cdot E(x)\right) \]

is well defined. In the Hamiltonian formalism, Gauss’ Law resurfaces as the secondary
constraint stating that the primary constraint be preserved in time, viz.

\[ \dot{\Pi}_0(x) = -\frac{\delta h}{\delta A_0(x)} = \nabla\cdot E(x) = 0. \tag{10.370} \]

Since

\[ \partial_0\nabla\cdot E(x) = -\partial_i(\delta h/\delta A_i(x)) = -\partial_i(\Delta A_i - \partial_i\nabla\cdot A) = 0, \tag{10.371} \]
there are no “tertiary” constraints. Thus we have canonical phase space variables
(E,A) and $(\Pi_0,A_0)$, subject to (10.366) and to $\Pi_0(x) = 0$ for each $x \in \mathbb{R}^3$, i.e.,

\[ \Pi_0(\lambda_0) = \int d^3x\,\Pi_0(x)\lambda_0(x) = 0; \tag{10.372} \]
\[ \Pi(\lambda) = \int d^3x\,\nabla\cdot E(x)\lambda(x) = 0, \tag{10.373} \]

for all (reasonable) functions $\lambda_0$ and $\lambda$ on $\mathbb{R}^3$. The constraints (10.372) - (10.373)
are first class in the sense of Dirac, which means that their Poisson brackets are
equal to existing constraints (or zero). In the Hamiltonian formalism, the role of the
space-time dependent gauge transformations of the Lagrangian theory is played by
the canonical transformations generated by the first class constraints, i.e.,

\[ \delta_{\lambda_0}A_0(x) = \{\Pi_0(\lambda_0),A_0(x)\} = \lambda_0(x); \tag{10.374} \]
\[ \delta_{\lambda_0}A_i(x) = \delta_{\lambda_0}E_i(x) = 0; \tag{10.375} \]
\[ \delta_\lambda A(x) = \nabla\lambda(x); \tag{10.376} \]
\[ \delta_\lambda E(x) = 0; \tag{10.377} \]
\[ \delta_\lambda A_0(x) = 0. \tag{10.378} \]


The holy grail of the Hamiltonian formalism is to find variables that are both
gauge invariant and unconstrained. In our case, $A_\mu = (A_0,A)$ are unconstrained but
gauge variant, whilst $\Pi_\mu = (\Pi_0,-E)$ are gauge invariant but constrained! Now write
some vector field V as $V = V^L + V^T$, where $V^L = \Delta^{-1}\nabla(\nabla\cdot V)$ is the longitudinal
component, so that $V_i^T = (\delta_{ij} - \Delta^{-1}\partial_i\partial_j)V_j$ is the transverse part. Then the physical
variables of free electromagnetism are $A^T$ and $E^T$. The physical Hamiltonian

\[ h = \tfrac12\int d^3x\,\left(E^T(x)\cdot E^T(x) - A^T(x)\cdot\Delta A^T(x)\right), \tag{10.379} \]

then, is well defined on the physical (or reduced) phase space, which is the subset
of all $(A_\mu,\Pi_\mu)$ where the constraints (10.373) hold, modulo gauge equivalence.
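In Fourier space the operator $\Delta^{-1}\partial_i\partial_j$ acts as multiplication by $k_ik_j/k^2$, so the decomposition $V = V^L + V^T$ reduces to elementary linear algebra for each wave vector k. The following sketch works with a single Fourier mode; the function names and the sample vectors are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def longitudinal(k, V):
    """V^L = k (k.V) / |k|^2: Fourier-space form of Delta^{-1} grad(div V)."""
    scale = dot(k, V) / dot(k, k)
    return tuple(scale * c for c in k)

def transverse(k, V):
    """V^T_i = (delta_ij - k_i k_j / |k|^2) V_j."""
    VL = longitudinal(k, V)
    return tuple(v - l for v, l in zip(V, VL))

k = (1.0, -2.0, 0.5)      # sample wave vector (must be nonzero)
V = (0.3, 0.8, -1.1)      # sample Fourier amplitude of the vector field

VL, VT = longitudinal(k, V), transverse(k, V)
print(dot(k, VT))         # divergence of the transverse part: vanishes
print(cross(k, VL))       # curl of the longitudinal part: vanishes
```

The transverse part is divergence-free and the longitudinal part is curl-free, which is why only $A^T$ and $E^T$ survive as unconstrained gauge-invariant variables.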


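After this preparation, we now revisit the abelian Higgs model as a constrained
Hamiltonian system. It is convenient to combine the two real scalar fields $\varphi_1$ and $\varphi_2$
into a single complex scalar field $\varphi = (\varphi_1 + i\varphi_2)/\sqrt{2}$, and treat $\varphi$ and its complex
conjugate $\bar\varphi$ as independent variables. The Lagrangian (10.348) then becomes

\[ \mathcal{L} = -\tfrac14F_A^2 + \overline{D_\mu^A\varphi}\cdot D^\mu_A\varphi - V(\varphi,\bar\varphi), \tag{10.380} \]

with $D_\mu^A\varphi = (\partial_\mu - ieA_\mu)\varphi$, etc. The conjugate momenta $\Pi_\mu$ to $A_\mu$ are the same as
for free electromagnetism, i.e., $\Pi_0 = 0$ and $\Pi_i = -E_i$, and for $\varphi$ we obtain

\[ \pi = \partial\mathcal{L}/\partial\dot\varphi = \overline{D_0^A\varphi}; \tag{10.381} \]
\[ \bar\pi = \partial\mathcal{L}/\partial\dot{\bar\varphi} = D_0^A\varphi. \tag{10.382} \]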
The associated Hamiltonian h is equal to

\[ \int d^3x\,\left(\tfrac12E^2 + \tfrac12B^2 - A_0(\nabla\cdot E - j_0) + \bar\pi\pi + \overline{D_i^A\varphi}\,D_i^A\varphi + V(\varphi,\bar\varphi)\right), \tag{10.383} \]

where $j_0 = ie(\bar\pi\bar\varphi - \pi\varphi)$ is the zeroth component of the Noether current. Hence
the primary constraint remains $\Pi_0 = 0$, but the secondary constraint picks up an
additional term and becomes $\nabla\cdot E = j_0$ (which remains Gauss’ law!). The physical
(i.e., gauge invariant and unconstrained) variables can be computed as


Ga = VAG, By =e HAAG, (10.384) 


Tt, = eA IVAG FE, = cid VAR (10.385) 


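plus the same transverse fields $A^T$ and $E^T$ as in free electromagnetism. In terms of
the transverse covariant derivative $D_i^T = \partial_i - ieA_i^T$, the physical Hamiltonian h is

\[ \int d^3x\,\left(\tfrac12(E^T\cdot E^T - A^T\cdot\Delta A^T - j_0\Delta^{-1}j_0) + \bar\pi_A\pi_A + \overline{D_i^T\varphi_A}\cdot D_i^T\varphi_A + V(\varphi_A,\bar\varphi_A)\right). \tag{10.386} \]

The third term in (10.386) is the Coulomb energy, in which the charge density

\[ j_0 = ie(\bar\pi_A\bar\varphi_A - \pi_A\varphi_A) \tag{10.387} \]

is the same as $j_0$ in (10.383) (since the latter is gauge invariant). Remarkably, the physical field
variables carry a residual global U(1)-symmetry, viz.

\[ \varphi_A \mapsto \exp(i\alpha)\varphi_A; \tag{10.388} \]
\[ \pi_A \mapsto \exp(-i\alpha)\pi_A; \tag{10.389} \]
\[ \bar\varphi_A \mapsto \exp(-i\alpha)\bar\varphi_A; \tag{10.390} \]
\[ \bar\pi_A \mapsto \exp(i\alpha)\bar\pi_A, \tag{10.391} \]

and no change for $A^T$ and $E^T$, under which the Hamiltonian (10.386) is invariant.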
If V has a minimum at $\varphi = \bar\varphi = v$, we recover the Higgs mechanism: redefining

\[ \varphi_A = \exp(i\theta/v)(v + \chi), \tag{10.392} \]

and complex conjugate, and the reintroduction of the longitudinal components 
Al = —(1/ev)d,0; E’ =—evA~'dj\n, (10.393) 
of the gauge field and its conjugate momentum, the Hamiltonian (10.386) becomes 


(V-E) 
e2y2 


1 fax (E+ B+ aly ana + FPA VID), (10.394) 


where $A = A^L + A^T$ and $E = E^L + E^T$. This describes a massive vector field, and
the would-be Goldstone boson $\theta$ has disappeared, as befits the Higgs mechanism!
It is fair to say that the Higgs mechanism in quantum field theory (and more
generally, the notion of SSB in gauge theories) is poorly understood. Indeed, the
entire quantization of gauge theories is not well understood, except at the perturbative
level or on a lattice. The problems already come out in the abelian case with
d = 3. The main culprit is Gauss’ Law $\nabla\cdot E = j_0$. One would naively expect this
constraint to remain valid in quantum field theory as an operator equation, and this is
indeed the case in so-called physical gauges like the Coulomb gauge (i.e. $\partial_iA_i = 0$).
If we now look at condition (10.337) in §10.9, which for G = U(1) and for example 
51 = @ and 5@) = —Q for acharged field — = (¢~ +ig)/V2, or SQ = ig, reads 


\[ \lim_{\Lambda\nearrow\mathbb{R}^3}\int_\Lambda d^3y\,\omega([j_0(y,0),\varphi_\alpha(x,t)]) = -i\omega(\delta\varphi_\alpha(x,t)), \tag{10.395} \]

then it is clear that (10.395) can only hold if charged fields are nonlocal. For by
Gauss’ Law the commutator $[j_0(y,0),\varphi_\alpha(x,t)]$ equals $[\nabla\cdot E(y,0),\varphi_\alpha(x,t)]$, and by
Gauss’(!) Theorem in vector calculus, all contributions to the left-hand side of
(10.395) come from terms $[E_i(y,0),\varphi_\alpha(x,t)]$, with $y \in \partial\Lambda$ (i.e., the boundary of
$\Lambda$). These must remain nonzero if $\Lambda \nearrow \mathbb{R}^3$, at least if (10.395) holds. On the other
hand, such nonlocality must be enforced by massless fields, which idea leads to one
of the very few rigorous results about the Higgs mechanism (in the continuum):


Theorem 10.29. In the Coulomb gauge the following conditions are equivalent:


• The electromagnetic field A is massless;

• Eq. (10.395) holds for any field $\varphi_\alpha$;

• The charge operator $Q = \lim_{\Lambda\nearrow\mathbb{R}^3}\int_\Lambda d^3y\,j_0(y,0)$ exists (on some suitable domain
in $H_\omega$ containing $\Omega_\omega$) and satisfies $Q\Omega_\omega = 0$.


Hence (contrapositively), SSB of U(1) by the state $\omega$ is only possible if A is massive.
In that case, the Fourier transform of the two-point function $\langle 0|\varphi_\alpha(x,x_0)\,j_0(y,y_0)|0\rangle$
(cf. the proof of the Goldstone Theorem 10.28 in §10.9) has a pole at the mass of A.


This theorem indeed yields the Higgs mechanism for say the abelian Higgs model 
in a specific physical gauge: note that the idea that the would-be Goldstone boson is 
eaten by the gauge field is already suggested by Gauss’ Law, through which (minus) 
the canonical momentum E to A acquires $j_0$ as its longitudinal component; that is,
the very same field that creates the Goldstone boson from the ground state. 

In covariant gauges, all fields remain local, but (10.395) is rescued by the gauge- 
fixing term added to the Lagrangian. For example, adding %¢ = —(1/2€)(dyA")? 
to (10.348) leads to an equation of motion OF = jy —0,0,A", so that (discarding 
all surface terms by locality), one obtains 


—i@(5@a(x,t)) = fe dy @([A5A0(y,0), Par(x,t)]). (10.396) 


In the proof of the Goldstone Theorem, the massless Goldstone bosons do emerge,
but they turn out to lie in some “unphysical subspace” of $H_\omega$ (which, for local
gauges, is not a Hilbert space but has zero- and negative norm states).
Notes


In a philosophical context, the notion of emergence is usually traced to J.S. Mill 
(1843), who drew attention to ‘a distinction so radical, and of so much importance, 
as to require a chapter to itself’, namely the one between what Mill calls the prin- 
ciple of the “Composition of Causes’, according to which the joint effect of several 
causes is identical with the sum of their separate effects, and the negation of this 
principle. For example, in the context of his overall materialism, Mill believed that 
although all ‘organised bodies’ are composed of material parts, 


‘the phenomena of life, which result from the juxtaposition of those parts in a certain man- 
ner, bear no analogy to any of the effects which would be produced by the action of the 
component substances considered as mere physical agents. To whatever degree we might 
imagine our knowledge of the properties of the several ingredients of a living body to be 
extended and perfected, it is certain that no mere summing up of the separate actions of 
those elements will ever amount to the action of the living body itself.’ 

Mill (1952 [1843], p. 243) 


Mill launched what is now called British Emergentism (Stephan, 1992; McLaugh- 
lin, 2008; O’Connor & Wong, 2012), a school of thought which seems to have ended 
with C.D. Broad, who has our sympathy over Mill because of the doubt he expresses 
in our quotation in the preamble. Among the British Emergentists, the most modern 
views seem to have been those of S. Alexander, who, as paraphrased in O’Connor 
& Wong (2012), was committed to a view of emergence as 


‘the appearance of novel qualities and associated, high-level causal patterns which cannot be 
directly expressed in terms of the more fundamental entities and principles. But these pat- 
terns do not supplement, much less supersede, the fundamental interactions. Rather, they 
are macroscopic patterns running through those very microscopic interactions. Emergent 
qualities are something truly new (...), but the world’s fundamental dynamics remain un- 
changed.’ 


Alexander’s idea that emergent qualities ‘admit no explanation’ and had ‘to be accepted 
with the “natural piety” of the investigator’ foreshadowed the later notion 
of explanatory emergence. Indeed, philosophers distinguish between ontological 
and epistemological reduction or emergence, but ontological emergence seems a 
relic from the days of vitalism and other immature understandings of physics and 
(bio)chemistry (including the formation of chemical compounds, which Broad and 
some of his contemporaries still saw as an example of emergence in the strongest 
possible sense, i.e., falling outside the scope of the laws of physics). Recent liter- 
ature, including the present chapter, is concerned with epistemological emergence, 
of which explanatory emergence is a branch. For example, Hempel wrote: 


‘The concept of emergence has been used to characterize certain phenomena as ‘novel’, and 
this not merely in the psychological sense of being unexpected, but in the theoretical sense 
of being unexplainable, or unpredictable, on the basis of information concerning the spatial 
parts or other constituents of the systems in which the phenomena occur, and which in this 
context are often referred to as “wholes”.’ (Hempel, 1965, p. 62) 


See also Batterman (2002), Bedau & Humphreys (2008), Norton (2012), Silberstein 
(2002), Wayne & Arciszewski (2009), and many other surveys of emergence. 


§10.1. Spontaneous symmetry breaking: The double well 

The facts we use about the double-well Hamiltonian may be found in Garg (2000) 
or Landau & Lifshitz (1977) at a heuristic level (but with correct conclusions), or, 
rigorously, in Reed & Simon (1978), Simon (1985), Helffer (1988), and Hislop & 
Sigal (1996). Theorem 10.2 is Theorem XIII.47 in Reed & Simon (1978). 


§10.2. Spontaneous symmetry breaking: The flea 

The flea perturbation and its effect on the ground state were first described in 
Jona-Lasinio, Martinelli, & Scoppola (1981a,b), who used methods from stochas- 
tic mechanics. See also Claverie & Jona-Lasinio (1986). Using more conventional 
methods, their results were reconfirmed and analyzed further by e.g. Combes, Duclos, 
& Seiler (1983), Graffi, Grecchi, & Jona-Lasinio (1984), Helffer & Sjöstrand 
(1985), Simon (1985), Helffer (1988), and Cesi (1989). The “Flea on the Elephant” 
terminology used by Simon (1985) motivated the title of Landsman & Reuvers 
(2013), who, as will be explained in the next chapter, identified the proper host 
animal as a cat. All pictures in this section are taken from the latter paper (and 
were prepared by the second author). For the Eyring–Kramers formula see Berglund 
(2011) for mathematicians or Hänggi, Talkner, & Borkovec (1990) for physicists. 


§10.3. Spontaneous symmetry breaking in quantum spin systems 

The translation-non-invariant ground states mentioned after Proposition 10.5 are 
discussed e.g. in Example 6.2.56 in Bratteli & Robinson (1997). See also Liu & 
Emch (2005), which was an important source for this section, and Ruetsche (2011) 
for a discussion of the definition of SSB through non-implementability. For order 
parameters see e.g. Sewell (2002), §3.3. A proof of Proposition 10.8 may be found 
in Bratteli & Robinson (1997), Proposition 6.2.15. 


§10.4. Spontaneous symmetry breaking for short-range forces 

The idea of SSB goes back to Heisenberg (1928). The C*-algebraic approach in 
quantum spin systems with short-range forces is reviewed in Bratteli & Robin- 
son (1997); see also Nachtergaele (2007). Theorem 10.10 is due to Araki (1974); 
see also Simon (1993), Theorem IV.5.6, and Bratteli & Robinson (1997), Theorem 
6.2.18. In Definition 10.9, Araki required $\Omega_\omega$ to be separating for $\pi_\omega(A)''$ instead of 
$\omega$ to be $\alpha$-invariant, but in the presence of (10.53) and hence (10.53) these conditions 
are equivalent. The fact that (for short-range forces) global Gibbs states defined 
by (10.43) satisfy the KMS condition follows from Theorem 10.10, but this was the 
starting point of Haag, Hugenholtz, & Winnink (1967); see Winnink (1972). 

Uniqueness of KMS states for one-dimensional quantum spin systems with short- 
range forces at any positive temperature (which also holds for the classical case, e.g. 
the one-dimensional Ising model) has been proved by Araki (1975). See also Mattis 
(1965) and Altland & Simons (2010) for some of the underlying physical intuition. 


§10.5. Ground state(s) of the quantum Ising chain 

Theorem 10.11.1 was first established in Pfeuty (1970) by explicit calculation, 
based on Lieb, Schultz, & Mattis (1961). For more information on the quantum Ising 
model (also in higher dimension) see e.g. Karevski (2006), Sachdev (2011), Suzuki 
et al (2013), and Dutta et al (2015). Uniqueness of the ground state of the quantum 
Ising model with $B \neq 0$ holds in any dimension d, as first shown by Campanino, 
Klein & Perez (1991) on the basis of Perron–Frobenius type arguments similar to 
those for Schrödinger operators. The singular case B = 0 leads to a violation of the 
strict positivity conditions necessary to apply the Perron–Frobenius Theorem, and 
this case indeed features a degenerate ground state even when $N < \infty$. 

The overall picture of SSB described in this section arose from the work of Horsch 
& von der Linden (1988), Kaplan, Horsch, & von der Linden (1989), Kaplan, von 
der Linden, & Horsch (1990), and especially Koma & Tasaki (1993, 1994). See also 
van Wezel (2007, 2008), van Wezel & van den Brink (2007), and Fraser (2016). 

The analogy between the quantum Ising chain and the double-well potential may 
not be surprising physically, since the latter was originally derived from the former: 
in potassium dihydrogen phosphate, i.e. KH₂PO₄, each proton of the hydrogen bond 
would reside in one of the two minima of an effective double-well potential origi- 
nating in the oxygen atoms, if it were not for tunneling, parametrized by the field B, 
which at small values yields a symmetric ground state (De Gennes, 1963). 


§10.6. Exact solution of the quantum Ising chain: $N < \infty$ 

The general set-up to this solution is due to Lieb, Schultz, & Mattis (1961), and 
was adapted to the quantum Ising chain by Pfeuty (1970), with further details by Karevski 
(2006). The complex solution $q_0$ was already noted by Lieb et al. The energy splitting 
in higher dimensions does not seem to be known, but Koma & Tasaki (1994, 
eq. (1.5)) expect similar behaviour as in d = 1. 


§10.7. Exact solution of the quantum Ising chain: $N = \infty$ 

The solution described in this section is due to Araki & Matsui (1985), where 
further details may be found; this is a highlight of modern mathematical physics! 
Theorem 10.20 is due to Araki (1987), although such results have a long history 
going back to Shale & Stinespring (1964, 1965). For a very clear exposition see 
Ruijsenaars (1987). See also Evans & Kawahigashi (1998), Chapter 6. 

The reason the one-sided chain $\Lambda = \mathbb{N}$ is problematic is that although the bosonic 
algebra $\bigotimes_{j \in \mathbb{N}} M_2(\mathbb{C})$ and its fermionic counterpart $\mathrm{CAR}(\ell^2(\mathbb{N}))$ are well defined, and 
are isomorphic through the Jordan–Wigner transformation (10.102) - (10.103), the 
limiting dynamics has no simple form on either A or F, because the Fourier transform 
of $\ell^2(\mathbb{N})$ is the Hardy space $H^2(-\pi,\pi)$ of $L^2$-functions with positive Fourier 
coefficients, instead of the usual $L^2(-\pi,\pi)$. Unlike on $L^2$, the energies $\varepsilon_q$ of the 
fermionic quasiparticles do not define a multiplication operator on $H^2$. 


§10.8. Spontaneous symmetry breaking in mean-field theories 

The Poisson structure on S(B) was introduced by Bona (1988) and more gen- 
erally by Duffield & Werner (1992a); see also Bona (2000). Theorem 10.22 and 
Corollary 10.23 are due to Duffield & Werner (1992a). The symplectic leaves of the 
given Poisson structure on S(B) (for which notion see e.g. Marsden & Ratiu (1994) 
or Landsman (1998a)) were determined by Duffield & Werner (1992a): Two states 
$\rho$ and $\sigma$ lie in the same symplectic leaf of S(B) iff $\rho(a) = \sigma(uau^*)$ for some unitary 
$u \in B$. If $\rho$ and $\sigma$ are pure, this is the case iff the GNS-representations $\pi_\rho(B)$ 
and $\pi_\sigma(B)$ are unitarily equivalent, cf. Thm. 10.2.6 in Kadison & Ringrose (1986). 
In general the implication holds only in one direction: if $\rho$ and $\sigma$ lie in the same 
leaf, then they have unitarily equivalent GNS-representations. 

Our survey of equilibrium states of homogeneous mean-field models is based on 
Fannes, Spohn, & Verbeure (1980) and Bona (1989). For rigorous results on the 
Curie-Weiss model see Chayes et al (2008) and Ioffe & Levit (2013). Numerical 
evidence for the restoration of Butterfield’s Principle may be found in Botet, Julien 
& Pfeuty (1982) and Botet & Julien (1982), which are up to N ~ 150, and Vidal et 
al (2004), which reaches N = 1000. Note that experimental samples have N < 10. 

In the context of the BCS model of superconductivity (in the strong coupling 
limit), the Hamiltonian he in (10.287) or hg in (10.294) is called the Bogoliubov–Haag 
Hamiltonian, after Bogoliubov (1958) and Haag (1962). Further contributions 
to mean-field theories include Thirring & Wehrl (1967), Thirring (1968), Hepp 
(1972), Hepp & Lieb (1973), van Hemmen (1978), Rieckers (1984), Morchio & 
Strocchi (1987), Duffner & Rieckers (1988), Bona (1988, 1989, 2000), Unnerstall 
(1990a, 1990b), and Sewell (2002). For a nice proof of Theorem 10.25, which orig- 
inates in Fannes, Spohn, & Verbeure (1980) and Bona (1989), see Gerisch (1993). 

Even in the absence of a global KMS condition for @, one is justified in in- 
terpreting the primary states (of )° as pure thermodynamic phases of the given 
infinite quantum system, whose thermodynamics is described by the “phase space” 
S(M,(C)). Though somewhat against the spirit of Bohrification (according to which 
the commutative C*-algebra C(M,,(C)) is the right one to look at), the argument 
can be strengthened by enlarging A to A ®C(M,,(C)) (where the choice of the ten- 
sor product does not matter, since C(M,,(C)) is commutative and hence nuclear, see 
§C.13). This larger C*-algebra was introduced by Bona (1990), who proved: 


Theorem 10.30. 1. There is a unique time-evolution & on A®C(M,(C)) such that 
for any primary permutation-invariant state @ on A and a € A one (strongly) has 


Jim te (ety aetw ) = Ty(0,(a)). (10.397) 


2. The states @? and ob in (10.286), which are defined on A, extend to the tensor 


product A®C(M,(C)) as @F ® Up and of @ db, respectively, and as such satisfy 
the KMS condition at inverse temperature B with respect to the dynamics Q. 


§10.9. The Goldstone Theorem 

There is a large amount of literature on the Goldstone Theorem, both heuris- 
tic and rigorous. The former started with Goldstone, Salam, & Weinberg (1962), 
whereas the latter originates in Kastler, Robinson, & Swieca (1966); see also Buch- 
holz et al (1992). For a survey, see Strocchi (2008, 2012), whose approach (based on 
Morchio & Strocchi, 1987) we follow. See also Berzi (1979, 1981), Landau, Perez, 
& Wreszinski (1981), Fannes, Pule, & Verbeure (1982), and Wreszinski (1987). 


§10.10. The Higgs mechanism 

The original reference is Higgs (1964ab). Our discussion is based on Lusanna 
& Valtancoli (1996ab) and Struyve (2011), both of whom derive the physical vari- 
ables in the abelian Higgs model. See also Rubakov (2002), Strocchi (2008), where 
Theorem 10.29 may be found, and Stöltzner (2014) for some history and sociology. 


Chapter 11 
The measurement problem 


The measurement problem of quantum mechanics was probably born in 1926: 


‘Thus Schrödinger’s quantum mechanics gives a very definite answer to the question of the 
outcome of a collision; however, this does not involve any causal relationship. One obtains 
no answer to the question “what is the state after the collision,” but only to the question 
“how probable is a specific outcome of the collision” (in which the quantum-mechanical 
law of [conservation of] energy must of course be satisfied). This raises the entire problem 
of determinism. From the standpoint of our quantum mechanics, there is no quantity that 
could causally establish the outcome of a collision in each individual case; however, so far 
we are not aware of any experimental clue to the effect that there are internal properties of 
atoms that enforce some particular outcome. Should we hope to discover such properties 
that determine individual outcomes later (perhaps phases of the internal atomic motions)? 
Or should we believe that the agreement between theory and experiment concerning our in- 
ability to give conditions for a causal course of events is some pre-established harmony that 
is based on the non-existence of such conditions? I myself tend to relinquish determinism in 
the atomic world. But this is [also] a philosophical question, for which physical arguments 
alone are not decisive.’ (Born, 1926a, p. 866; translation by the author) 


In other words, quantum mechanics stipulates that the state after some collision (or 
measurement) is $\Psi = \sum_n c_n \Psi_n$, whereas experiment demonstrates that in fact the final 
state is just one of the $\Psi_n$, with (Born) probability $|c_n|^2$. Quantum mechanics, 
then, seems unable to account for single outcomes of experiments and has to satisfy 
physicists with merely probabilistic predictions. This, in a nutshell, is the measure- 
ment problem—although very substantial analysis is needed to flesh it out. 
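
As a minimal numerical sketch of this contrast (an illustration only, not part of the original argument), the following Python fragment takes hypothetical coefficients c_n, computes the Born probabilities |c_n|^2, and then samples single outcomes. The function names and the example amplitudes are assumptions made for the illustration; the sampling step is put in by hand, which is exactly what the formalism itself does not supply.

```python
import random


def born_probabilities(coeffs):
    """Return the Born probabilities |c_n|^2 for (possibly unnormalized) complex coefficients."""
    weights = [abs(c) ** 2 for c in coeffs]
    total = sum(weights)
    if total == 0:
        raise ValueError("state vector must be nonzero")
    return [w / total for w in weights]


def sample_outcome(coeffs, rng):
    """Pick a single index n with probability |c_n|^2 (imposed by hand, not derived)."""
    probs = born_probabilities(coeffs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical superposition Psi = c_0 Psi_0 + c_1 Psi_1 with |c_0|^2 = 0.36 and |c_1|^2 = 0.64.
coeffs = [0.6, 0.8j]
probs = born_probabilities(coeffs)

rng = random.Random(0)  # fixed seed so that repeated runs agree
trials = 10_000
counts = [0, 0]
for _ in range(trials):
    counts[sample_outcome(coeffs, rng)] += 1
frequencies = [c / trials for c in counts]
print(probs, frequencies)
```

Each run of `sample_outcome` yields one definite index, while only the relative frequencies over many runs are fixed by the coefficients; the theory predicts the latter and is silent on the former.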

Giving up determinism was soon incorporated in the Copenhagen Interpretation 
of Bohr and Heisenberg (cf. the Introduction) and more broadly became part of 
what might be called “orthodoxy”, which represents the apparent (but not actual) 
consensus among Bohr, Heisenberg, Pauli, Born, Jordan, Dirac, von Neumann, and 
many others, which they supposedly reached around 1930 after the formal com- 
pletion of quantum mechanics. This “orthodoxy”, which later gave rise to the un- 
fortunate “shut up and calculate” attitude most physicists seem to have (especially 
towards the measurement problem), should be distinguished from the Copenhagen 
Interpretation. For example, von Neumann never endorsed the doctrine of classical 
concepts, which in the above attitude has been replaced by the different and far more 
superficial idea that it is the entire goal of physics to explain experiments. 


11.1 The rise of orthodoxy 


Even within the strict Copenhagen Interpretation, there were sharp differences be- 
tween Bohr and Heisenberg, beyond the one concerning classical concepts reviewed 
in the Introduction. However, it seems that they agreed about the following point 
made by Bohr in his Como lecture concerning measurement: 


‘According to the quantum theory, just the impossibility of neglecting the interaction with 
the agency of measurement means that every observation introduces a new uncontrollable 
element.’ (Bohr, 1928, p. 584) 


This placed measurement squarely outside quantum mechanics for the second time: 
the first time was in the insistence that the measurement device (“if it is to serve 
its purpose”) had to be described classically (cf. the Introduction), and now we also 
learn that the interaction between the quantum object undergoing measurement and 
the apparatus in question is “uncontrollable”, despite the fact that Bohr and Heisen- 
berg regarded quantum mechanics as a complete theory: their argument was ap- 
parently that precisely the classical nature of the apparatus makes the interaction 
uncontrollable. This in turn justified the classical description of the device, in that 
registration of a measurement result ought to be “objective”, so that reading it out 
by performing a measurement on the apparatus, so to speak, should not introduce 
any further disturbance and hence uncontrollability (or so the argument goes). 

Consistent with Bohr’s point, a more detailed conceptual analysis of the measure- 
ment process was given by Heisenberg (1958, pp. 46-47, 54-55), who consistently 
refers to the quantum state or wave-function as the “probability function”: 


‘Therefore, the theoretical interpretation of an experiment requires three distinct steps: 


1. the translation of the initial experimental situation into a probability function; 

2. the following up of this function in the course of time; 

3. the statement of a new measurement to be made of the system, the result of which can 
then be calculated from the probability function. 


(...) After [the] interaction [with the measuring device] has taken place, the probability 
function contains the objective element of tendency and the subjective element of incom- 
plete knowledge, even if it has been a “pure case” before [i.e., it has become a mixture]. 
It is for this reason that the result of the observation cannot generally be predicted with 
certainty; what can be predicted is the probability of a certain result of the observation, 
and this statement about the probability can be checked by repeating the experiment many 
times. (...) The observation itself [i.e., the act of registration of the result by the mind of the 
observer] changes the probability function discontinuously; it selects of all possible events 
the actual one that has taken place. Since through the observation our knowledge of the sys- 
tem has changed discontinuously, its mathematical representation also has undergone the 
discontinuous change and we speak of a “quantum jump.”’ 


Here we find the typical Copenhagen view of measurement as a two-step process: 


1. Measurement turns an initial pure state (of the measured object) into a mixture; 
2. One term in this mixture is singled out (by Nature and thence by the observer). 
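
The two steps can be made concrete in a small sketch, assuming a two-level object and hypothetical amplitudes; none of the names below come from the sources quoted here. Step 1 is modeled by simply deleting the off-diagonal entries of the density matrix in the measurement basis, and step 2 by sampling one term. Both steps are imposed by hand rather than derived from unitary dynamics.

```python
import random


def density_matrix(coeffs):
    """Pure-state density matrix rho[m][n] = c_m * conj(c_n) for a normalized vector."""
    return [[cm * cn.conjugate() for cn in coeffs] for cm in coeffs]


def drop_off_diagonal(rho):
    """Step 1 (imposed by hand): keep only the diagonal, turning the pure state into a mixture."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0j for j in range(n)] for i in range(n)]


def purity(rho):
    """Tr(rho^2): equals 1 for a pure state and is strictly smaller for a proper mixture."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(n) for j in range(n))


def select_outcome(rho, rng):
    """Step 2 (imposed by hand): single out one term with probability rho[n][n]."""
    weights = [rho[i][i].real for i in range(len(rho))]
    return rng.choices(range(len(rho)), weights=weights, k=1)[0]


coeffs = [complex(0.6), complex(0, 0.8)]  # hypothetical amplitudes c_0, c_1
rho_pure = density_matrix(coeffs)
rho_mixed = drop_off_diagonal(rho_pure)
outcome = select_outcome(rho_mixed, random.Random(1))
print(purity(rho_pure), purity(rho_mixed), outcome)
```

The drop in `purity` marks the passage from pure state to mixture in step 1; the diagonal entries, and hence the outcome statistics, are untouched by it, so step 2 is a separate ingredient.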


Note that Heisenberg’s last comment puts him squarely into the camp of what is 
now called “QBism” (i.e., Quantum Bayesianism, see §11.2 below)! 

Von Neumann (1932, §VI.1) gave a more formal (and highly influential) presentation 
of the (alleged) two stages of the measurement process: 


‘In the discussion so far we have treated the relation of quantum mechanics to the various 
causal and statistical methods of describing nature. In the course of this we found a pe- 
culiar dual nature of the quantum mechanical procedure which could not be satisfactorily 
explained. Namely, we found that on the one hand a state $\phi$ is transformed into the state $\phi'$ 
under the action of an energy operator H in the time interval $0 \leq \tau \leq t$: 
Therefore, as a consequence of the causal change of $\phi$ into $\phi'$ the [pure] states $U = P_{[\phi]}$ 
[$= |\phi\rangle\langle\phi|$] go over into the [pure] states $U' = P_{[\phi']}$ (process 2 in V.1.). On the other hand, 
the state $\phi$ – which may measure a quantity with discrete spectrum, distinct eigenvalues and 
eigenfunctions $\phi_1, \phi_2, \ldots$ – undergoes in a measurement a non-causal change in which each 
of the states $\phi_1, \phi_2, \ldots$ can result, and in fact does result with the respective probabilities 
$|(\phi, \phi_1)|^2, |(\phi, \phi_2)|^2, \ldots$. That is, the mixture 
obtains (...) (process 1 in V.1.). Since the [pure] states [i.e. $P_{[\phi]}$] go over into mixtures, 
the process is not causal. The difference between these two processes $U \to U'$ is a very 
fundamental one: aside from their different behaviors in regard to the principle of causality, 
they are also different in that the former is (thermodynamically) reversible, while the latter 
is not.’ (pp. 417-418 in von Neumann (1955); translation: R.T. Beyer) 


All this concerns merely the first stage of the measurement, in which a pure state 
is transformed into a mixed one. The second stage, in which a single outcome is 
obtained, is already alluded to above (though clouded by von Neumann’s ensemble 
language), but is described (in prose) later on through what is now called a von Neu- 
mann chain: one redefines system plus apparatus as the system, and couples it to a 
new apparatus, etc. This chain supposedly ends with the “ego” of the “individual” 
whose “intellectual inner life” is finally responsible for a single outcome. 

It is very remarkable that von Neumann nowhere seems to use the central Copen- 
hagen dogma that the apparatus be described classically (cf. the Introduction), espe- 
cially since the mathematics of operator algebras he was inventing at almost exactly 
the same time is tailor-made for incorporating this dogma (which fact indeed forms 
the motivation for the present book). One clue for his lack of enthusiasm may come 
from the very end of his book (i.e., §VI.3), where he challenges ‘an explanation 
often proposed to account for the statistical character of the process 1’, namely the 
idea that (the non-unitary) process 1 might have its origin in an initial mixed state of 
the apparatus. Indeed, even if the apparatus as a quantum-mechanical system is in a 
pure state (as any system should be ontologically), its description as a classical system 
generally renders its state mixed, and the same conclusion may be drawn on 
epistemic grounds, arguing that the state of macroscopic or otherwise complicated 
systems cannot be known exactly. Many writings by the Copenhagen school, then, 
suggest that the alleged unanalyzable nature of the measurement and the random- 
ness of its outcome should be attributed to the classical description of the apparatus 
and its ensuing mixed state, including our earlier quotation (cf. §8.4) from Heisen- 
berg (1958) on the origin of probabilities in quantum mechanics: 


‘these uncertainties (...) are simply a consequence of the fact that we describe the experi- 
ment in terms of classical physics’ (Heisenberg, 1958, p. 53) 


To counter this argument, von Neumann argues that physics requires the (Born) 
probabilities for the various outcomes to depend only on the initial state $\phi$ of the 
quantum system undergoing measurement (as opposed to the state of the apparatus, 
be it classical or quantum), whereas any “process 2” (i.e. unitary) time evolution 
would merely push the coefficients $w_n$ in the (alleged) mixed apparatus state into the 
role of probabilities for the possible outcomes. However, ‘the $w_n$ are characteristic 
of the observer alone (and therefore independent of $\phi$)’, and hence 


‘the non-causal nature of the process 1. is not produced by any incomplete knowledge of 
the state of the observer.’ (von Neumann, 1955, p. 439). 


Von Neumann’s argument became the mother of all “insolubility theorems” for the 
measurement problem, some of which will be reviewed in §11.3 below. 

Pauli (1933, §9) also includes some comments on measurement and the interpre- 
tation of quantum mechanics in general. These display a bizarre hybrid between the 
ideas of Bohr and von Neumann, somehow mediated by Heisenberg. Thus Pauli en- 
dorses (even starts with) some notion of Complementarity, but he relates this to the 
mathematical formalism rather than to the doctrine of classical concepts (which he 
nowhere invokes). Similarly, his treatment of measurement on the one hand follows 
the disturbance ideology of Bohr and Heisenberg (but without grounding this in the 
classical description of the apparatus), whilst technically he quotes and follows von 
Neumann, claiming that measurement leads to mixtures which subsequently reduce 
to one term through ‘ein besonderer, naturgesetzlich nicht im Voraus determinierter 
Akt’ (i.e., a special act that is not determined in advance by the laws of nature). A rather 
more systematic review of early measurement theory was written by London & 
Bauer (1939), whose opening is highly promising and almost poetic: 


‘The majority of introductions to quantum mechanics follow a rather dogmatic path from 
the moment that they reach the statistical interpretation of the theory. In general they are 
content to show, by more or less intuitive considerations, how the actual measuring devices 
always introduce an element of indeterminism, as this interpretation demands. However, 
care is rarely taken to verify explicitly that the formalism of the theory, applied to that 
special process which constitutes the measurement, truly implies a transition of the system 
under study to a state of affairs less fully determined than before. A certain uneasiness 
arises. One does not see exactly with what right and up to what point one may, in spite of 
this loss of determinism, attribute to the system an appropriate state of its own. Physicists 
are to some extent sleepwalkers, who try to avoid such issues and try to concentrate on 
concrete problems. But it is exactly these questions of principle which nevertheless interest 
nonphysicists and all who wish to understand what modern physics says about the analysis 
of the act of observation itself.’ (London & Bauer, 1939, pp. 218-219) 


Yet the authors mainly repeat von Neumann’s analysis (confirming its lofty status): 


‘The interaction with the apparatus does not put the object into a new pure state. Alone, 
it does not confer to the object a new wave function. On the contrary, it actually gives 
nothing but a statistical mixture: It leads to one mixture for the object and one mixture for 
the apparatus. For either system regarded individually there results uncertainty, incomplete 
knowledge. Yet nothing prevents our reducing this uncertainty by further observation. 

And this is our opportunity. So far we have only coupled one apparatus with one object. 
But a coupling even with a measuring device is not yet a measurement. A measurement is 
achieved only when the position of the pointer has been observed. It is precisely the increase 
of knowledge, acquired by the observation, that gives the observer the right to choose among 
the different components of the mixture predicted by the theory, to reject those which are not 
observed, and to attribute thenceforth to the object a new wave function, that of the pure case 
which he has found. We note the essential role played by the consciousness of the observer 
in this transition from the mixture to the pure state. Without his effective intervention, one 
would never obtain a new $\psi$ function.’ (ibid., p. 251) 


Accordingly, at the end of the golden era of quantum mechanics, the view of mea- 
surement as a two-stage process in which a pure state is first transformed into a mix- 
ture in a more or less scientific way, upon which unanalyzable and possibly mental 
phenomena bring about a single outcome, was firmly established, although—the 
point deserves to be repeated—in their formal treatments neither von Neumann nor 
London & Bauer incorporated the key claim Bohr and Heisenberg made about mea- 
surement, namely that the corresponding apparatus must be described classically. 
Opponents of the Copenhagen Interpretation (the most prominent among whom 
were Einstein and Schrödinger) were well aware of this tension between formalism 
and ideology, which in the form of Schrödinger’s Cat even reached immortality (!): 


‘One may also construct highly burlesque cases. A cat is confined in a box of steel together 
with the following hellish machine (which one should secure against a direct attack by the 
cat): A Geiger counter contains a tiny amount of radioactive material, so little that during 
one hour possibly one of its atoms decays, but equally likely also none does; if it does, then 
the counter is triggered and activates, via a relay, a little hammer which breaks a small 
container of hydrocyanic acid. Having left this system to itself for one hour, one will say 
that the cat is still alive if meanwhile no atom has decayed. The first decay of an atom 
would have poisoned her. The $\psi$-function of the entire system would express this in such 
a way that in it the living and the dead cat would be mixed or spread out on equal terms. 
What is typical about these cases is that an uncertainty which is originally limited to the 
atomic domain has been transformed into a coarse-grained uncertainty, which may then 
be decided by direct observation. This prevents us from regarding a “faded model” as an 
image of reality in such a naive way. As such [this model] contains nothing that is unclear 
or contradictory. There is a difference between a moved or poorly focused photograph and 
a record of clouds and fog banks.’ (Schrödinger, 1935, p. 812; translation by the author) 


The last sentence is particularly powerful, contrasting Schrödinger’s (as well as Einstein’s) 
view that physics should describe some sharply defined reality (of which 
quantum mechanics at best produces blurred pictures) with the Copenhagen view, 
according to which reality itself lacks focus (with quantum mechanics providing the 
best possible picture of it). This contrast confirms our idea that Schrödinger’s Cat 
metaphor specifically draws attention to the problems that arise from the Copenhagen 
“duality postulate” that macroscopic systems (such as measurement devices 
and cats) admit both a classical and a quantum-mechanical description. 


11.2 The rise of modernity: Swiss approach and Decoherence 


Despite Schrödinger’s Cat, the measurement problem was not an active field of research 
until Wigner (1963) rekindled interest in the topic. Even so, his paper mainly 
reiterated von Neumann’s views—which already had been repeated by London and 
Bauer—including his omission of the doctrine of classical concepts. In particular, it 
continued to promulgate the suggestion that measurement is a two-step process for 
which the clarification of the first step (i.e. of turning a pure state into a mixture) 
would already be a major part of the solution of the measurement problem. 
Wigner’s paper inspired for example the “Swiss” approach to the measurement 
problem, which was remarkable in being the first serious mathematical attempt to 
take into account the Bohr–Heisenberg dogma that the apparatus be described classically, 
whilst also paying tribute to von Neumann in insisting on mathematical rigour. 
Indeed, the Swiss approach relies on the formalism of operator algebras, which also 
marks a conceptual break with all earlier (and indeed most later) approaches in 
taking the observables rather than the states as a starting point. The aim of the Swiss 
approach is to show that relative to a suitable class of observables, the pure state 
coincides with the corresponding mixture without the off-diagonal terms. 
Thus the ambition of this approach is limited, in that no attempt is made to explain 
(at least the appearance of) single outcomes, except by appealing to the ignorance 
interpretation of probability (in vain, see below). The alleged equivalence between 
pure states and mixtures can typically be achieved if the apparatus is infinite and 
the measurement time is infinite, too. The infinite character of the apparatus (here 
seen as an idealization of a macroscopic device, as is standard in quantum statistical 
mechanics), is no guarantee for its classicality, but it is certainly a step in the right 
direction (cf. Chapter 8). Thus two closely related problems must be overcome: 


1. In its reliance on superselection sectors (technically, on disjoint states on a suit- 
able algebra of observables of the apparatus, see Definition 8.18), the program 
only works in the limit of infinite apparatus and infinite measurement time. In- 
deed, any approximation ruins the equivalence between pure states and mixtures; 
and hence even this limited solution to the problem violates Earman’s Principle. 

2. In so far as the subsequent problem of obtaining single outcomes to measurement 
is recognized in the Swiss approach at all, it seems to be addressed by an appeal to 
the ignorance interpretation of probability. Despite the fact that the mathematical 
situation in this respect is better than in ordinary quantum mechanics (where 
the ignorance interpretation of the formal probability distribution given by the 
coefficients in a diagonal density operator is nonsensical, if only because the 
state space is not a simplex), there is still no valid argument for this move. 

To explain the last point, we quote Leggett (though somewhat out of context):

‘Now, following Schrödinger, let us consider a thought experiment in which the quantum-
mechanical description of the final state, as obtained by appropriate solution of the time de-
pendent Schrödinger equation, contains simultaneously nonzero probability amplitudes for
two or more states of the universe that are, by some reasonable criterion, macroscopically
distinct (in Schrödinger’s example, this would be “cat alive” and “cat dead”). Of course, just
about everyone, including me, would accept that because of, inter alia, the effects of deco- 
herence, it is likely to be impossible, at least for the foreseeable future, to experimentally 
demonstrate the interference of such states. (On the other hand, as the late John Bell was 
fond of pointing out, the foreseeable future is not a very well-defined concept. In fact, as 
late as 1999, not a few people were confidently arguing that because of the inevitable ef- 
fects of decoherence, the projected experiments to demonstrate interference at the level of 
flux qubits would never work. In this case, the foreseeable future lasted approximately one 
year. As Bell used to emphasize, the answers to fundamental interpretive questions should 
not depend on the accident of what is or is not currently technologically feasible.) But the 
crucial point is that the formalism of quantum mechanics itself has changed not one whit 
between the microscopic and macroscopic levels. Are we then entitled to embrace, at the 
macrolevel, an interpretation that was forbidden at the microlevel, simply because the ev- 
idence against it is no longer available? I would argue very strongly that we are not, and 
would therefore draw the conclusion: also at the macrolevel, when the quantum-mechanical 
description assigns simultaneously nonzero [probabilities] to two or more macroscopically 
distinct possibilities, then it is not the case that each system of the relevant ensemble realizes
either one possibility or the other.’ (Leggett, in Schlosshauer, 2011, p. 155)


This argument of Leggett’s (which is a special case of Earman’s Principle) was orig- 
inally targeted at decoherence, but it also applies verbatim to the Swiss approach 
(which is closely related to decoherence, as both heavily rely on limits and super-
selection rules, which are absolute in the former and dynamically induced in the
latter). In an even earlier anticipation of Earman’s Principle, Bell (this time aiming di-
rectly at the Swiss approach) in fact made a related point about its reliance on the
$t \to \infty$ limit (in that even at extremely large but finite time the state remains pure).
Jumping to the modern era, a striking point of continuity with the 1920s and 
1930s is the idea that the measurement procedure (and hence the measurement prob-
lem) consists of two stages; only the terminology and the scope have changed:


‘There are two distinct measurement problems in quantum mechanics: what Pitowsky has 
called a “big” measurement problem and a “small” measurement problem. The “big” mea- 
surement problem is the problem of explaining how measurements can have definite out- 
comes, given the unitary dynamics of the theory: it is the problem of explaining how in- 
dividual measurement outcomes come about dynamically. The “small” measurement prob- 
lem is the problem of accounting for our familiar experience of a classical, or Boolean, 
macroworld, given the non-Boolean character of the underlying quantum event space: it is
the problem of explaining the dynamical emergence of an effectively classical probability.’
(Bub, in Schlosshauer, 2011, pp. 145-146)


Clearly, the “small” measurement problem is modern parlance for the problem
of how to turn a superposition into a mixture, upon which the “big” problem (if it is
noticed at all) still concerns the old issue of selecting one term from this mixture.

Furthermore, the measurement problem seems to have acquired increased scope
and importance, as exemplified by the following quotations:

‘One of the most ancient philosophical questions (Heidegger thought it was the question) is
this: why is there something rather than nothing? In terms of events rather than substances, 
the question would be: how come anything happens at all? That question is the measurement 
problem.’ (Fine, in Schlosshauer, 2011, p. 146) 


‘The measurement problem has been called “the reality problem” by Philip Pearle. This is 
a better name for it. We perceive objects in the world as being in definite states. A door 
is either open or shut, a given ball either is in a given box or it is not. The wave function, 
however, can have superpositions of these things, suggesting that the door can be simultane- 
ously open and shut at the same time, and that the ball can be both in the box and not in the 
box at the same time. The reality problem is that there is a discrepancy between the version 
of reality we perceive, and the version presented to us by the most obvious interpretation of 
the wave function.’ (Hardy, in Schlosshauer, 2011, p. 153) 


‘Fundamentally, the measurement problem is the problem of connecting probability with 
truth in the quantum world, that is to say, it is the problem of how to relate quantum probabil- 
ities to the objective occurrence and non-occurrence of events. The problem arises because 
there appears to be a difficulty in reconciling the objectivity of a particular measurement 
outcome with the entangled state at the end of a measurement.’ (Bub, ibid., p. 145) 


More technically, the measurement problem has come to be seen as a special case 
of the problem of explaining at least the appearance of the classical world from 
quantum theory. If the measurement problem is seen from the Copenhagen perspec- 
tive this is eminently reasonable, as both problems involve the dual description of 
either the apparatus or the world around us as both classical and quantum (and its 
possible failure). In this context, an alleged solution to the “small” problem, such as 
Decoherence, is often also seen as this explanation (as if there were no issue about 
the derivation of the laws of classical physics, including the dynamical ones). 

A propos, another characteristic feature of the modern era is undoubtedly the 
dominance of Decoherence (if only over the Swiss approach), for example: 


‘I think the whole discussion about whether measurements in quantum mechanics are in- 
deed problematic somewhat misses the point. Measurement interactions are only one of 
many examples of quantum interactions that lead to superpositions of macroscopically dis- 
tinct states. Nature has been producing macroscopic superpositions for millions of years, 
well before any quantum physicist cared to artificially engineer such a situation. The key 
concept here is decoherence. Environmental interactions tend to produce superpositions of 
classically distinct states. This raises the issue of how one could describe a classical regime 
in quantum mechanics, quite irrespective of the existence of measuring apparatuses. (...) 

If decoherence and its applications had been developed early in the history of quantum 
theory, then the idea that measurements play a special role in the theory might not have 
risen to such prominence, and the foundations of quantum mechanics would have focused 
instead on the problem of how to derive a classical regime within the theory.’ 
(Bacciagaluppi, in Schlosshauer, 2011, p. 143) 


Mathematically, decoherence boils down to the idea of adding one more link to 
the von Neumann chain (see §11.1) beyond S+A (i.e. the system and the apparatus). 
However, there is a fundamental conceptual as well as technical dif-
ference between Decoherence and older approaches that took such a step: whereas
previously (e.g., in the hands of von Neumann, London & Bauer, and Wigner) the 
chain converged towards the observer, in Decoherence it diverges away from the 
observer. Namely, the third and final link is now taken to be the environment. 
This notion is often taken in a fairly literal sense in agreement with the intuitive
meaning of the word, but it may also (we would even say: preferably) refer to inter-
nal degrees of freedom of the apparatus, as in the Spehner–Haake model in §11.4.
Either way, the “environment” is usually treated as an infinite system (necessitating
a limit like $N \to \infty$), which (in simple models where the pointer has discrete spec-
trum) has the consequence that the post-measurement state $\sum_n c_n \psi_n \otimes \varphi_n \otimes \chi_n$ (in
which the $\chi_n$ are mutually orthogonal) is reached only in the limit $N \to \infty$
of infinitely many degrees of freedom and, moreover, only in the limit $t \to \infty$ of infinite time. In
that case, the restriction of the above state to S+A (i.e. the trace of the corresponding
density operator over the degrees of freedom of the environment) is mixed, which
means that the quantum-mechanical interference between the states $\psi_n \otimes \varphi_n$ for dif-
ferent values of $n$ has become “delocalized” to the environment, and accordingly is
deemed irrelevant if the latter is not observed (i.e. omitted from the description).
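
The mechanism just described can be illustrated numerically. The following sketch is not one of the models treated in this book: it assumes, purely for illustration, a two-term superposition in which the product vectors psi_n ⊗ phi_n (n = 1, 2) are treated as the two basis vectors of a single qubit, and a toy environment of N spins whose record states chi_1, chi_2 have the real overlap cos(theta)^N. The function names and the parameter theta are hypothetical. Under these assumptions the off-diagonal entry of the reduced density matrix on S+A equals c_1 times the conjugate of c_2 times the overlap of the environment states, so the restriction is exactly diagonal only if the chi_n are orthogonal, which in this toy environment happens only as N tends to infinity.

```python
import math

def reduced_density_matrix(c1, c2, overlap):
    """Reduced state on S+A of c1|1>|chi1> + c2|2>|chi2>.

    `overlap` is <chi2, chi1>, taken real here for simplicity.
    Tracing out the environment multiplies the off-diagonal
    entry c1 * conj(c2) by this overlap.
    """
    off = c1 * c2.conjugate() * overlap
    return [[abs(c1) ** 2, off],
            [off.conjugate(), abs(c2) ** 2]]

def purity(rho):
    """Tr(rho^2) for a 2x2 matrix; equals 1 exactly for pure states."""
    total = 0.0
    for i in range(2):
        for j in range(2):
            total += (rho[i][j] * rho[j][i]).real
    return total

def toy_overlap(theta, n_spins):
    """Assumed overlap of the environment records for N spins."""
    return math.cos(theta) ** n_spins

c1 = c2 = complex(1 / math.sqrt(2))
for n_spins in (0, 1, 10, 100):
    rho = reduced_density_matrix(c1, c2, toy_overlap(0.3, n_spins))
    # Columns: N, size of the interference term, purity of the reduced state.
    print(n_spins, abs(rho[0][1]), purity(rho))
```

For every finite N the interference term printed in the second column is small but nonzero, which is the numerical counterpart of the remark that the mixture is reached only in the limit.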

Unfortunately, in so far as it claims to provide a solution to the measurement 
problem, Decoherence is an unmitigated disaster: 


1. Decoherence actually aggravates the measurement problem: where previously 
this problem was believed to be man-made and relevant only to rather unusual 
laboratory situations, it has now become clear that “measurement” of a quantum 
system by the environment (instead of by an experimental physicist) happens 
everywhere and all the time: hence it remains even more miraculous than before 
that there is a single outcome after each such measurement. 

2. Even the need for one of the two limits $N \to \infty$ or $t \to \infty$ makes Decoherence
vulnerable to Earman’s Principle; see Bell’s and Leggett’s critiques above. 

3. Like the Swiss approach, Decoherence suffers from the difficulty that even if it 
were able to reach its goal of reducing pure states to mixtures (about which ability 
one may have doubts), there is no sound follow-up step to solve the next problem 
of selecting one term from the mixture produced in the previous step. The igno- 
rance interpretation seems blocked by Leggett’s argument quoted above (i.e. his 
continuity argument to the effect that Decoherence just removes the evidence for 
a given Schrödinger’s cat state to be a superposition, elsewhere charging those
claiming that Decoherence solves the measurement problem with committing the
logical fallacy that removal of the evidence for a crime would undo the crime). 


Thus Decoherence is parasitic on some interpretation of quantum mechanics that 
solves the measurement problem, which in turn is typically strengthened by it. In 
this context, the most popular of these has been the Everett (i.e., Many-Worlds) 
Interpretation, which, after decades of obscurity or even derision, suddenly started to 
be greeted with a flourish of trumpets in the wake of the popularity of Decoherence. 
However, even if such extravagant interpretations are coherent, these should in our 
opinion be a very last resort, acceptable only if truly everything else has failed. 

On the positive side, Decoherence has led to the important idea of einselection 
(for environment-induced superselection), where a pure state $\psi$ of some system
(possibly plus apparatus) is “einselected” if it remains pure after coupling to the 
environment and subsequent restriction. The hope (or rather program), then, is to 
show that classical states are classical precisely because they are robust in this way. 




Finally, it may be appropriate to close this historical introduction to the measure- 
ment problem by mentioning another modern approach, namely outright denial: 


‘I remember giving a talk at a meeting at the London School of Economics seven or so years 
ago. In the audience was an Oxford philosophy professor, and I suppose he didn’t much like 
my brash cowboy dismissal of a good bit of his life’s work. When the question session came 
around, he took me to task with the most proper and polite scorn I had ever heard (I guess 
that’s what they do). “Excuse me. You seem to have made an important point in your talk, 
and I want to make sure that I have not misunderstood anything. Are you saying that you 
have solved the measurement problem? This problem that has plagued quantum mechanics 
for seventy-five years? The message of your talk is that, using quantum information theory, 
you have finally solved it?” (Funny the way the words could be put together as a question, 
but have no intended usage but as a statement.) I don’t know that I did anything but turn the 
screw on him a bit further, but I remember my answer. “No, not me; I haven’t done anything.
What I am saying is that a “measurement problem” never existed in the first place. (...) 


The “measurement problem” is purely an artefact of a wrong-headed view of what quan- 
tum states and/or quantum probabilities ought to be. (...) quantum states are not real things 
from a Quantum Bayesian view (...) but a personal judgment, a quantified degree of belief. 
A quantum state is a set of numbers an agent uses to guide the gambles he might take on 
the consequences of his potential interactions with a quantum system. It has no more sub- 
stantiality than that. Aren’t epistemic states real things? Well ... yes, in a way. They are as 
real as the people who hold them. But no one would consider a person to be a property of 
the quantum system he happens to be contemplating. And one shouldn’t think of a quantum 
state in that way either – one shouldn’t think of it as a property of the quantum system to
which it is assigned. Take the source of the paradox away, we say, and the paradox itself 
will go away.’ (Fuchs, in Schlosshauer, 2011, pp. 146-147) 


These words have been quoted at some length, because the view that “physics is 
information” and its alleged corollary that all foundational problems are solved by 
Bayesian reasoning (perhaps with a quantum flavour) is becoming increasingly pop-
ular. Physicists are now seen as punters (or, in academic parlance, “agents”) who
in smoky offices bet on the outcomes of experiments, and hence use (quantum) 
Dutch Book arguments to justify some sort of strictly epistemic (quantum) proba- 
bility calculus. However, the ideology of “QBism” thus expressed appears to have 
adopted precisely the weakest ingredients of the Copenhagen Interpretation—viz. 
the idea that the wave-function is just a catalogue of the probabilities for possible 
outcomes of measurements whose details are supposedly beyond our grasp, cf. the 
Introduction—at the expense of its one strong component, namely the doctrine of 
classical concepts. Although there may have been pragmatic reasons for this atti- 
tude in the 1920s, (mathematical) physics has moved forward since then, enabling 
much more detailed analysis and hence justifying considerably greater ambition in 
understanding the measurement process than Bohr and Heisenberg cum suis had. 

In any case, the fact that one competent author regards the measurement problem 
as the key to reality whilst another flatly denies even its very existence should give 
pause for thought. As in the Bohr–Einstein debate, different perspectives on reality
and on the task of physics seem to play a role here, culminating in contrasting views 
of quantum-mechanical states: the more “reality” one attributes to states, the more 
serious the measurement problem is. Or, contrapositively, the more operationalist 
one’s attitude, the further the problem disappears behind the horizon. 

11.3 Insolubility theorems


Since in §11.4 we will “propose the impossible”, namely miraculously solving the 
measurement problem within unitary quantum mechanics, it is helpful to review the 
arguments why this is generally felt to be impossible. Such arguments take the form 
of so-called insolubility theorems. As already mentioned, such theorems ultimately 
go back to von Neumann: especially those that prove the impossibility of explaining 
his process 1 (i.e. the transition from a pure state to a mixture) from process 2
(unitary time evolution according to the Schrödinger equation). Another kind of
insolubility theorem shows that single outcomes are impossible from process 2. 

It might be argued that both kinds of theorem add little to the basic mathematical 
intuition behind the measurement problem, which is as follows (it goes without 
saying that we disagree with this traditional description of measurement, see below). 
Let $s \in B(H_S)$ be the observable being measured (where $H_S$ is some Hilbert space
associated to a quantum object S undergoing measurement) and let $a \in B(H_A)$ be
a “pointer observable” correlated to S (where $H_A$ is a second Hilbert space). In
particular, the measurement apparatus A is described quantum mechanically. For the
moment we assume both Hilbert spaces to be finite-dimensional and both operators
to be non-degenerate, even having the same spectrum $\{\lambda_1, \ldots, \lambda_n\}$; this of course
implies that $\dim(H_S) = \dim(H_A) = n$. Thus $H_S$ has a basis $(v_i^{(S)})$ of eigenvectors of
$s$ and likewise $H_A$ has a basis $(v_i^{(A)})$ of eigenvectors of $a$, with $s v_i^{(S)} = \lambda_i v_i^{(S)}$ and
$a v_i^{(A)} = \lambda_i v_i^{(A)}$ ($i = 1, \ldots, n$). The (erroneous) argument, then, is as follows:


1. Measurement should establish a correlation between values of $s$ of S and values
of $a$ of A, which with the above labeling implies that for each $i$ the initial sys-
tem state $v_i^{(S)}$ should push the pointer from some initial state $\psi_0^{(A)}$ into a final
(post-measurement) state $v_i^{(A)}$. Hence the dynamics, described by some unitary
operator $u \in B(H_S \otimes H_A)$, should be such that

$$u(v_i^{(S)} \otimes \psi_0^{(A)}) = v_i^{(S)} \otimes v_i^{(A)} \equiv \varphi_i. \qquad (11.1)$$

2. If the initial system state is $\psi^{(S)} = \sum_i c_i v_i^{(S)}$ (with $\sum_i |c_i|^2 = 1$), then, by linearity
of $u$, the final state is $\varphi = \sum_i c_i \varphi_i$. But if A is sufficiently macroscopic this con-
flicts with observation, which always shows one of the terms in the sum. In other
words, in theory, $a$ (more precisely, $1_{H_S} \otimes a$) has no value in this state, whereas
in practice it does, since in the real world measurements do have outcomes.

3. Hence the final state should be the mixed density operator $\sum_i |c_i|^2 |\varphi_i\rangle\langle\varphi_i|$ (rather
than the pure one $|\varphi\rangle\langle\varphi|$), whose ignorance interpretation (allegedly) yields one
of the states $\varphi_i$ with probability $|c_i|^2$. But it is impossible to transform the initial
pure state $|\varphi_0\rangle\langle\varphi_0|$, where $\varphi_0 = \psi^{(S)} \otimes \psi_0^{(A)}$, into the above mixture by any unitary
operator, let alone by the $u$ defined by (11.1), which by construction yields

$$u |\varphi_0\rangle\langle\varphi_0| u^* = |\varphi\rangle\langle\varphi| \neq \sum_i |c_i|^2 |\varphi_i\rangle\langle\varphi_i|. \qquad (11.2)$$
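
The obstruction in step 3 can be checked in the smallest case n = 2. The sketch below assumes (this choice is made here for illustration and is not part of the argument above) that both Hilbert spaces are two-dimensional, that the initial pointer state is the first basis vector, and that u is the controlled-NOT permutation matrix, which satisfies (11.1) for this pointer state. The coefficients 0.3 and 0.7 are arbitrary. The code compares Tr(rho^2) for the unitarily evolved final state and for the mixture of step 3; since conjugation by a unitary preserves Tr(rho^2), the two can only agree if one of the coefficients vanishes.

```python
import math

# Basis order for H_S (x) H_A = C^2 (x) C^2: |00>, |01>, |10>, |11>.
# Assumed dynamics: controlled-NOT, so u(v_i (x) psi_0) = v_i (x) v_i
# when the pointer starts in psi_0 = |0>, as required by (11.1).
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def projector(vector):
    """Rank-one operator |v><v| for a vector v."""
    return [[a * complex(b).conjugate() for b in vector] for a in vector]

def purity(rho):
    """Tr(rho^2), which equals 1 exactly for pure states."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(n) for j in range(n))

c = [math.sqrt(0.3), math.sqrt(0.7)]   # coefficients c_1, c_2
phi_0 = [c[0], 0.0, c[1], 0.0]         # (c_1 v_1 + c_2 v_2) (x) psi_0
phi = apply(CNOT, phi_0)               # final state: sum_i c_i phi_i

pure_final = projector(phi)
# Mixture sum_i |c_i|^2 |phi_i><phi_i| with phi_1 = |00>, phi_2 = |11>.
mixture = [[0.0] * 4 for _ in range(4)]
mixture[0][0] = c[0] ** 2
mixture[3][3] = c[1] ** 2

print(purity(pure_final))   # purity of the unitarily evolved state
print(purity(mixture))      # purity of the mixture: sum_i |c_i|^4
```

The first number is 1 up to rounding, the second is strictly smaller, so no unitary can map the initial pure state to the mixture.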

As we already discussed, for some authors the measurement problem is the clash be-
tween nos. 1 and 3 (this is the “small” problem), whereas for others it is the conflict
between nos. 1 and 2 (i.e. the “big” one). Either way, the goal of insolubility theo-
rems is to show that the problem is not a consequence of idealizations in primitive
arguments like the one just given, but remains even under very general assumptions. 
In particular, both the purity of the initial system as well as apparatus states (and 
hence of their tensor product), and the exact system-apparatus correlation assumed 
(including the premise of point spectra and finite-dimensional Hilbert spaces), can 
be considerably relaxed. To illustrate the kind of discussion, we present one example 
of an insolubility proof along the former lines and one along the latter. These proofs 
even remain valid if the notion of an observable itself is relaxed, too, namely from 
a self-adjoint operator to a POVM (see (2.178)), but we will not discuss this utmost 
generality (if only because it would not circumvent our critique below). It should be 
noted that insolubility theorems tacitly assume that the mathematical objects in the 
quantum-mechanical formalism describe all there is physically. 

In the first direction, we have Theorem 11.2 below, which we may summarize as 
the problem of statistics: there is a contradiction between the following postulates: 


1. System and apparatus are both described quantum-mechanically.

2. The wave-function of the system is complete.

3. The wave-function always evolves linearly (e.g., by the Schrödinger equation).

4. Measurements with identical initial wave-functions may have different out- 
comes, and the probability of each possible outcome is given by the Born rule. 


Here the second and third postulates may be consequences of the first, but even so 
it is useful to list them separately, since denying or circumventing nos. 1, 2, and 3 is 
typically done in completely different ways (see the end of this section). 

Formally, let $s = s^* \in B(H_S)$ be an arbitrary self-adjoint operator on an arbitrary
(separable) Hilbert space $H_S$, with associated spectral projections $e_\Delta^{(s)} \in \mathcal{P}(H_S)$,
$\Delta \subset \sigma(s)$, and likewise $a \in B(H_A)$. It is convenient (and entails no genuine loss of
generality) to still assume that $\sigma(s) = \sigma(a)$. Recall that the Born measure $\mu_{\rho_S}^{(s)}$ on
the spectrum $\sigma(s)$ induced by some density operator $\rho_S \in \mathcal{D}(H_S)$ is given by

$$\mu_{\rho_S}^{(s)}(\Delta) = \mathrm{Tr}\,(\rho_S e_\Delta^{(s)}) = \omega_S(e_\Delta^{(s)}) = \mu_{\omega_S}^{(s)}(\Delta), \qquad (11.3)$$

cf. (4.9), where $\omega_S$ is the state associated to $\rho_S$ by (2.33), and no notational confusion
between $\mu_{\rho_S}^{(s)}$ and $\mu_{\omega_S}^{(s)}$ should arise (they are the same thing). Likewise for $a$.


Definition 11.1. 1. Let $H$ be a Hilbert space and let $b \in B(H)_{sa}$. Two (normal)
states $\omega, \omega'$ on $B(H)$ are called b-distinguishable if $\mu_\omega^{(b)} \neq \mu_{\omega'}^{(b)}$; in other words,
there is some $\Delta \subset \sigma(b)$ such that $\mu_\omega^{(b)}(\Delta) \neq \mu_{\omega'}^{(b)}(\Delta)$. Similarly for $\rho, \rho' \in \mathcal{D}(H)$.

2. In the situation described before (11.3), a pair $(\rho_A, u)$, where $\rho_A$ is a density
operator on $B(H_A)$ and $u$ is a unitary operator on $H_S \otimes H_A$, is a measurement
scheme for $s$ if s-distinguishability of two density operators $\rho_S, \rho_S'$ on $H_S$ implies
$1_{H_S} \otimes a$-distinguishability of the two states $u(\rho_S \otimes \rho_A)u^*$ and $u(\rho_S' \otimes \rho_A)u^*$.

3. A measurement scheme $(\rho_A, u)$ for $s$ preserves probabilities if for any density
operator $\rho_S \in \mathcal{D}(H_S)$ the probability measure on $\sigma(a) = \sigma(1_{H_S} \otimes a)$ induced by
$u(\rho_S \otimes \rho_A)u^*$ equals the Born measure $\mu_{\rho_S}^{(s)}$ on $\sigma(s) = \sigma(a)$ induced by $\rho_S$.

4. A density operator $\rho \in \mathcal{D}(H_S \otimes H_A)$ objectifies the pointer observable $a$ relative
to some countable partition $\sigma(a) = \bigsqcup_i \Delta_i$ of its spectrum if $\rho = \sum_i p_i e_{v_i}$, where
each unit vector $v_i \in H_S \otimes H_A$ is an eigenvector of $1_{H_S} \otimes e_{\Delta_i}^{(a)}$ ($p_i > 0$, $\sum_i p_i = 1$).


For example, assuming a discrete spectrum for simplicity, if $\lambda_1 \neq \lambda_2$ in $\sigma(b)$, then
any two unit eigenvectors $v_i^{(b)}$ ($i = 1, 2$) give rise to b-distinguishable vector states
$\rho_i = |v_i^{(b)}\rangle\langle v_i^{(b)}|$. If $\psi = c_1 v_1^{(b)} + c_2 v_2^{(b)}$ with $|c_1|^2 + |c_2|^2 = 1$ and $c_i \neq 0, 1$, then
also the trio $(\rho_1, \rho_2, e_\psi)$ is pairwise b-distinguishable. If, on the other hand, $\lambda \in \sigma(b)$
is degenerate, then $e_\psi$ and $e_{\psi'}$ fail to be b-distinguishable whenever $\psi, \psi' \in H_\lambda$.
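
For a finite-dimensional observable given by a diagonal matrix, b-distinguishability in the sense of clause 1 can be tested directly. The following sketch assumes that b is diagonal in the standard basis with the listed eigenvalues; the helper names born_measure and distinguishable are introduced here for illustration only. It reproduces the two cases just discussed: eigenvectors for different eigenvalues and a nontrivial superposition of them are pairwise distinguishable, whereas two unit vectors in the same degenerate eigenspace are not.

```python
import math
from collections import defaultdict

def born_measure(eigenvalues, psi):
    """Born probabilities of a unit vector psi for a diagonal observable.

    eigenvalues[k] is the k-th diagonal entry of b; probabilities
    belonging to equal eigenvalues are added, which handles degeneracy.
    """
    measure = defaultdict(float)
    for lam, amplitude in zip(eigenvalues, psi):
        measure[lam] += abs(amplitude) ** 2
    return dict(measure)

def distinguishable(eigenvalues, psi, phi, tol=1e-12):
    """True if the Born measures of psi and phi differ at some point.

    For a discrete spectrum two measures differ on some subset
    exactly when they differ at some single eigenvalue.
    """
    mu = born_measure(eigenvalues, psi)
    nu = born_measure(eigenvalues, phi)
    return any(abs(mu[lam] - nu[lam]) > tol for lam in mu)

# Non-degenerate case: b = diag(1, 2).
b = [1.0, 2.0]
v1, v2 = [1.0, 0.0], [0.0, 1.0]
psi = [math.sqrt(0.5), math.sqrt(0.5)]
print(distinguishable(b, v1, v2),
      distinguishable(b, v1, psi),
      distinguishable(b, v2, psi))

# Degenerate case: b = diag(1, 1, 2); both vectors lie in the
# eigenspace of the eigenvalue 1.
b_deg = [1.0, 1.0, 2.0]
print(distinguishable(b_deg, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```
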
Clause 2 of Definition 11.1 (which incorporates a vast number of at least theo-
retical scenarios) is a considerable weakening of the scheme (11.1), while clause
3 sharpens the second, implying that measurement transfers all Born probabilities
for the object to the apparatus, probabilistically making the latter a mirror image
of the former. Clause 4 firstly takes care of continuous spectra; if $\sigma(a)$ is discrete,
one may simply partition it by its points (a partition of $\sigma(a)$ is sometimes called
a reading scale). The “objectification” terminology is questionable (if not outright
misleading), as it is motivated by the ignorance interpretation of mixtures (see be-
low), but we follow the literature in using it. In what follows, we exclude the trivial
cases where $\sigma(s)$ consists of a single point, and/or $\sigma(a)$ is partitioned by itself.


Theorem 11.2. For any nontrivial object observable $s$ and partitioning of $\sigma(a)$,
there exists no measurement scheme $(\rho_A, u)$ for $s$ whose final state $u(\rho_S \otimes \rho_A)u^*$ ob-
jectifies $a$ for any initial system state $\rho_S$ (let alone one that preserves probabilities).


Proof. Since we will not use this theorem (except for pointing out that it attacks a
straw man), we just prove it in the special case where $\sigma(a)$ is discrete and parti-
tioned by its points, and also the spectral decomposition $\rho_A = \sum_n p_n e_n$ of the initial
apparatus state is unique, cf. (B.490). For any unit vector $v^{(S)} \in H_S$ we then have

$$u(e_{v^{(S)}} \otimes \rho_A)u^* = \sum_n p_n u(e_{v^{(S)}} \otimes e_n)u^*. \qquad (11.4)$$

Take $\lambda_1 \neq \lambda_2$ in $\sigma(s)$, with associated eigenvectors $v_1^{(S)}$ and $v_2^{(S)}$. If $e_n = |\alpha_n\rangle\langle\alpha_n|$,
for unit vectors $\alpha_n \in H_A$, then objectification of $a$ requires that each of the vectors

$$u(v_1^{(S)} \otimes \alpha_n), \quad u(v_2^{(S)} \otimes \alpha_n), \quad u((c_1 v_1^{(S)} + c_2 v_2^{(S)}) \otimes \alpha_n),$$

with $|c_1|^2 + |c_2|^2 = 1$ and $c_i \neq 0, 1$, must be an eigenvector of $1_{H_S} \otimes a$. This is only
possible if the first two vectors (and hence the third) lie in the same eigenspace for
$1_{H_S} \otimes a$, but in that case condition no. 2 in Definition 11.1 is violated, since the three
given initial system states are pairwise s-distinguishable whereas the corresponding
outcome states just listed evidently fail to be $1_{H_S} \otimes a$-distinguishable.




Insolubility theorems of the second kind describe the problem of outcomes, ac- 
cording to which clauses 1., 2., and 3. of the problem of statistics also contradict: 


4’. Measurements have determinate outcomes. 


Technical statements to this effect are even more straightforward than those for-
malizing the problem of statistics. We keep $H_S$ and $s \in B(H_S)$ as they were, but this
time, $H_A$ may refer to the rest of the Universe outside the quantum object described
by $H_S$ (which includes the pointer, of course). Here is the key assumption.


Definition 11.3. Let $s \in B(H_S)_{sa}$ be an object observable with partition
$\sigma(s) = \bigsqcup_{i \in I} \Delta_i$ of its spectrum (if $\sigma(s) = \{\lambda_1, \ldots\}$ is discrete, one may take $\Delta_i = \{\lambda_i\}$),
and let $H_A$ be a second Hilbert space. A sound measurement scheme consists of:

• A collection $(S_i)_{i \in I}$ of outcome spaces, i.e. subsets of the (normal) state space,

$$S_i \subset S_n(H_S \otimes H_A) \cong \mathcal{D}(H_S \otimes H_A), \qquad (11.5)$$

for which there is $0 \leq \eta < 1/2$ such that for $i \neq j$, one has

$$2\sqrt{1 - \eta} \leq \|\omega_i - \omega_j\| \leq 2 \quad (\omega_i \in S_i,\ \omega_j \in S_j). \qquad (11.6)$$

• A pair $(\rho_A, u)$, where $\rho_A$ is a density operator on $B(H_A)$ and $u$ is a unitary on
$H_S \otimes H_A$, such that for each $i \in I$ and each unit vector $v_i^{(S)} \in e_{\Delta_i}^{(s)} H_S$ (i.e., $e_{\Delta_i}^{(s)} v_i^{(S)} = v_i^{(S)}$),
the state $u(e_{v_i^{(S)}} \otimes \rho_A)u^*$ (i.e. the outcome of the measurement) lies in $S_i$.

In (11.6) the first bound (which for small $\eta$ is $\approx (2 - \eta) \leq \cdots$) is the key one, as
the last one $\leq 2$ is always satisfied and has been included for clarity. In particular,

$$\|\omega_i - \omega_j\| > \sqrt{2}. \qquad (11.7)$$

Note that (11.6) implies that the $S_i$ must be disjoint, since assuming $\omega \in S_i$ gives
$\|\omega - \omega_j\| \geq 2\sqrt{1 - \eta}$ for all $\omega_j \in S_j$, whereas $\omega \in S_j$ allows one to take $\omega_j = \omega$
in this inequality, leading to the contradiction $0 \geq 2\sqrt{1 - \eta}$. Note that in terms of
density operators we have

$$\|\omega_i - \omega_j\| = \|\rho_i - \rho_j\|_1, \qquad (11.8)$$

where $\omega_i(a) = \mathrm{Tr}\,(\rho_i a)$, cf. (B.481) and Theorem B.146. If $\omega_i$ and $\omega_j$ are pure,
induced by unit vectors $\psi_i$ and $\psi_j$ in $H_S \otimes H_A$, then by (C.637), eq. (11.6) comes
down to

$$0 \leq |\langle \psi_i, \psi_j \rangle|^2 \leq \eta. \qquad (11.9)$$

For example, in the von Neumann measurement scheme (11.1), the outcome space $S_i$ just
consists of the vector state defined by $v_i^{(S)} \otimes v_i^{(A)}$, hence (11.6) holds with $\eta = 0$.


Theorem 11.4. For any nontrivial object observable $s$ and partitioning of $\sigma(s)$, any
sound measurement scheme $((S_i)_{i \in I}, \rho_A, u)$ admits initial states $v \in H_S$ such that
$u(e_v \otimes \rho_A)u^*$ (i.e. the post-measurement state) does not lie in any outcome space $S_i$.

Proof. Let $v = (v_i + v_j)/\sqrt{2}$, where $i \neq j$ and for the moment $v_i$ and $v_j$ are merely
orthonormal vectors in $H_S$. For each $l \in \{i, j\}$ we then compute:

$$\begin{aligned}
\|u(e_v \otimes \rho_A)u^* - u(e_{v_l} \otimes \rho_A)u^*\|_1^{(H_S \otimes H_A)} &= \|e_v \otimes \rho_A - e_{v_l} \otimes \rho_A\|_1^{(H_S \otimes H_A)} \\
&= \|e_v - e_{v_l}\|_1^{(H_S)} \\
&= \|\omega_v - \omega_{v_l}\| \\
&= 2\sqrt{1 - |\langle v, v_l \rangle|^2} \\
&= \sqrt{2}, \qquad (11.10)
\end{aligned}$$

where $\|\cdot\|_1^{(H)}$ denotes the trace norm relative to $H$. Now take $v_i = v_i^{(S)}$ as in Defini-
tion 11.3. Since $\omega_i = u(e_{v_i^{(S)}} \otimes \rho_A)u^* \in S_i$ by definition of a sound measurement scheme, it
follows from (11.7) and (11.10) that $\omega = u(e_v \otimes \rho_A)u^*$ cannot lie in any outcome space
$S_k$, since that would require $\|\omega - \omega_l\| > \sqrt{2}$ for all $l \neq k$, whereas (11.10) shows
that this inequality fails for at least two values of $l$, viz. $l = i$ and $l = j \neq i$.
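
The computation (11.10) can be verified numerically in the simplest setting. The sketch below assumes a real two-dimensional system space with v_i and v_j the standard basis vectors (an illustrative choice made here, not part of the proof), and computes the trace norm of the difference of the two projections from the eigenvalues of that 2x2 matrix rather than from the closed formula, so that the two can be compared. The resulting distance equals the square root of 2 and is therefore not strictly larger than it, which is what (11.7) would require.

```python
import math

def projector(v):
    """Real rank-one projection |v><v| on R^2."""
    return [[v[0] * v[0], v[0] * v[1]],
            [v[1] * v[0], v[1] * v[1]]]

def trace_norm_of_difference(p, q):
    """Trace norm of p - q for real symmetric 2x2 matrices.

    The eigenvalues of a symmetric 2x2 matrix m are
    tr/2 +- sqrt((tr/2)^2 - det); the trace norm is the sum of
    their absolute values.
    """
    m = [[p[r][c] - q[r][c] for c in range(2)] for r in range(2)]
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return abs(half_trace + root) + abs(half_trace - root)

v_i, v_j = [1.0, 0.0], [0.0, 1.0]
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # v = (v_i + v_j) / sqrt(2)

for v_l in (v_i, v_j):
    distance = trace_norm_of_difference(projector(v), projector(v_l))
    inner = v[0] * v_l[0] + v[1] * v_l[1]
    closed_form = 2 * math.sqrt(1 - inner ** 2)
    # Columns: numerical trace norm, closed formula, sqrt(2).
    print(distance, closed_form, math.sqrt(2))
```
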


In order to circumvent Theorems 11.2 and 11.4, one should deny at least one of 
their explicit premises. Moreover, we note that postulate no. 3 (i.e. linearity of time- 
evolution) is always implicitly used in the form of the following counterfactual: 


If $\psi_n$ were the initial state, then for each $n$ it would evolve (linearly) according
to the Schrödinger equation with given Hamiltonian $h$. If the initial state were
$\sum_n c_n \psi_n$, also then it would evolve according to the same Hamiltonian $h$.


This counterfactual should be added as a tacit assumption to all insolubility proofs 
(and also to informal statements of the measurement problem). As such, it may 
reasonably be denied (see §11.4), and such a denial puts assumption no. 4 in the 
problem of statistics in perspective, namely by denying the possibility that identical 
initial states can always be prepared in such a way that they evolve through exactly 
the same Hamiltonian. This leaves room for the following denials of some premise: 


¬1. The apparatus is not described quantum-mechanically;

¬2. The wave-function of the system is not complete;

¬3. The wave-function does not always evolve by the Schrödinger equation;

¬4. Identical initial wave-functions always yield identical outcomes;

¬4’. Measurements do not have determinate outcomes.


Current programs for solving the measurement problem neatly fall into this scheme: 


¬1. Copenhagen Interpretation and Swiss Approach;

¬2. Hidden-variable theories, most prominently Bohmian mechanics;

¬3. Dynamical collapse theories (such as GRW);

¬4. Instability approaches, e.g., the Flea on Schrödinger’s Cat (which keeps 3);

¬4’. Many-Worlds Interpretation, i.e., Everettian quantum mechanics.


Leaving most of these to the literature, we now turn to the instability approach (¬4).
11.4 The Flea on Schrödinger’s Cat


The conclusion of this lengthy historical and technical introduction is that there are 
(at least) two different formulations of the measurement problem, whose insolubility 
is expressed by Theorems 11.2 and 11.4, respectively (leaving apart lavish opportu- 
nities for disagreement about the precise formulation of the underlying assumptions, 
and not even speaking about the outright dismissal of the whole issue as a Schein- 
problem). Thus the problem in question is evidently of a different kind from say the 
famous open conjectures in mathematics (like the Riemann hypothesis), where it is 
clear what the theorem is that needs to be proved. Nonetheless, despite its undeni- 
able philosophical aspects, we see the measurement problem as a genuine physics 
problem concerned with the discrepancy between (quantum) theory and experiment, 
to be addressed by mathematical, physical, and philosophical analysis. 

Well aware that different people typically draw different lessons from history, 
we will now, in the interest of motivating our approach to follow, draw our own 
(necessarily subjective) conclusions from the history of the measurement problem. 


1. Though grounded in genius and tradition (Heisenberg, von Neumann, Wigner), 
the two-step way of looking at the measurement process (i.e. in terms of firstly 
a reduction of the wave-function by some non-unitary “process 1” and secondly 
a registration of a single outcome), with ensuing separation of the measurement 
problem into a “small” and a “big” problem, is fruitless and should be abandoned. 
It has no basis whatsoever in experimental physics (where the alleged mixed 
post-measurement states are conspicuously absent), it reflects obsolete ensemble 
thinking, and it is unsound also theoretically, as shown both by the first kind of 
insolubility results (à la von Neumann and Theorem 11.2), as well as by the failure
of programs addressing just the “small” problem (like the Swiss approach
and Decoherence). These approaches are unable to deal with the “big” problem 
(except perhaps through desperate remedies like Many Worlds) and hence, even 
if they work, they deliver Pyrrhic victories at best. The problem of obtaining sin- 
gle outcomes should be solved directly, before it is too late. Since such a solution 
would leave nothing to interfere, the “small” problem automatically disappears. 
This does not mean that it is sufficient to obtain definite outcomes alone; among 
all remaining challenges, deriving the Born rule stands out in particular. 

2. Too much formal analysis has been done on the measurement problem (including 
the insolubility theorems just reviewed) without taking the special nature of mea- 
surement devices into account; alas, this negligence has its roots in the work of 
von Neumann. These devices are typically treated as ordinary quantum systems, 
as a consequence of which the notion of an “outcome” has to be defined within 
quantum mechanics and hence has to be identified e.g. with an eigenstate of some 
operator describing the apparatus (as in Theorem 11.2) or with some subspace of 
the quantum-mechanical state space (as in Theorem 11.4). Such identifications 
are purely formal and have little basis in experimental physics: as long as one 
defines outcomes of measurements within quantum mechanics, there is no mea- 
surement problem (but at worst some unease concerning value indefiniteness)! 
Fig. 11.1 The waves crashed between the towering cliff of Scylla and the jagged rocks of Charybdis.
Colour lithograph by Gino D’Antonio. Reprinted with permission from Look and Learn Ltd.


On the other hand, both the Copenhagen Interpretation and the Swiss approach 
seem to have gone too far in the opposite direction: the former because it simply 
assumed (without providing any justification) that measurements have outcomes 
as soon as the apparatus is described classically, the latter in treating apparatuses 
as strictly infinite, and hence falling victim to Earman’s Principle. The right ap- 
proach, then, must be to define measurement as in the Copenhagen Interpreta- 
tion, i.e. using a classical description of the apparatus whilst realizing it is on- 
tologically a quantum system, and thusly navigate between Scylla (who treats 
measurement devices as arbitrary quantum systems) and Charybdis (who is too
enthusiastic in taking infinite limits and hence in using a classical description). 
3. Some kind of reality has to be attributed to the state of the system (though this 
reality cannot be “absolute”, as in classical physics). In the algebraic approach 
to quantum theory adopted throughout the present book, the starting point is pro- 
vided by the observables, relative to which states are defined. Since the doctrine 
of classical concepts drives us to switch between quantum-mechanical and clas- 
sical descriptions, the reality of the quantum state is therefore perspectival. How- 
ever, their perspectival nature does not make states less real; they say everything 
there is to say (at least by quantum theory) about some given level of description 
(which may be said to be chosen by the observer, and hence is intersubjective). 


Thus the measurement problem arises in the way Schrödinger (rather than von Neumann)
described it, although a precise framework has to be added to his poetry.

A framework that is precise both conceptually and mathematically is offered by 
asymptotic emergence, which we already encountered in our discussion of SSB in 
the previous chapter (see especially its preamble). To repeat the main points, we 
speak of asymptotic emergence if the following three conditions are all satisfied: 
1. A “higher-level theory” H (which in the context of the measurement problem is
either classical mechanics or classical thermodynamics, depending on the mea- 
surement setup) is a limiting case of some “lower-level theory” L (viz. quantum 
mechanics, including quantum statistical mechanics of a finite system). 

2. Theory H is well defined and understood by itself (typically predating L). 

3. Theory H has “emergent” features that cannot be explained by L, e.g. because L 
does not have any property inducing those feature(s) in the limit pertinent to H. 


The root of the measurement problem (and hence the relevance of asymptotic emergence),
then, lies in Bohr’s requirement that the outcomes of measurements on systems
defined within L be recorded in (at least the language of) H, so that, crucially,
measurement according to L is a notion external to L (if only partly), in particular
involving the relationship between L and H. None of the insolubility proofs
of the measurement problem take this into account (although due to Butterfield’s
Principle these proofs remain relevant in a secondary way). The typical feature of
H that would be emergent in the above sense if the measurement problem were unresolved
is that every physical system subject to the theory H is ontologically in
a pure state; in Schrödinger’s words quoted in §11.1: in H, sharply focused photographs
of states are always possible (and hence any uncertainty or chance is due
to ignorance, as in classical physics). Now, whatever the ontological nature of states
in L, the states they induce in H should be real in the above sense, i.e., pure. But
this is precisely what does not seem to be the case in typical measurement situations
(e.g., Schrödinger’s Cat), where the post-measurement state on L induces a mixed
state on H. Just as in the case of SSB, this violates Butterfield’s Principle, which in
the case at hand states that since H is an idealization of L, any physical effect in H
must be foreshadowed in L: as L approaches H, sharp measurement outcomes (defined
as pure states in H) must arise from at least approximate single measurement
outcomes (i.e. “singly-peaked wave-functions”) in the relevant asymptotic regime
of L (since only these wave-functions give rise to pure classical states on H).

As noted before in the setting of SSB: violating Butterfield’s Principle means 
violating Earman’s Principle, which in turn leads to a violation of the link between 
theory and reality. It is worth spelling this out for the measurement problem: 


• Reality is described by quantum mechanics (even in the Copenhagen Interpretation,
classical mechanics is an idealization of quantum mechanics);

• Real phenomena (in this case, sharp measurement outcomes) are correctly described
by classical mechanics although this is an idealization;

• Quantum mechanics (allegedly) cannot possibly induce these phenomena in its
limit towards classical mechanics although it is the theory that should apply;

• Hence quantum mechanics contradicts reality. Classical mechanics does not contradict
the reality of sharp measurement outcomes, but it is not the appropriate
theory to explain them; this explanation should come from quantum mechanics.


It may now seem that invoking Butterfield’s Principle has reduced the measure- 
ment problem to the usual one(s) described in the preceding sections. But look at 
the small print: in the Copenhagen Interpretation, single measurement outcomes 
only appear in some limiting “classical” regime of quantum mechanics. 
“Deep inside” quantum mechanics, there is no need at all for the typical superposition
$\sum_n c_n \psi_n$ to collapse into one of the states $\psi_n$ (unless one conflates the physical
measurement problem with the philosophical problem of value indefiniteness). The 
external and asymptotic nature of measurement outcomes causes the measurement 
problem, but, as we shall see, at the same time it provides the key for its solution, 
since the collapse mechanism we propose is only effective asymptotically (so that it 
operates where it should and does not act where it should not). More precisely, by 
taking into account perturbations of the Hamiltonian that are tiny and ineffective in 
the quantum regime, but become hugely destabilizing in the classical regime (even 
before the actual limit), the wave-function of the apparatus will collapse. 

Summarizing the preceding discussion, “our” measurement problem states that: 


• Certain pure post-measurement states of an (ontologically quantum-mechanical!)
apparatus coupled to a microscopic quantum object induce mixed states on the 
apparatus (and on the composite) once the apparatus is described classically. 


This is a precise version of Schrödinger’s Cat problem (rather than von Neumann’s
purely quantum-mechanical measurement problem), making it clear that at heart the 
problem does not lie with the (dis)appearance of interference terms (which is a red 
herring) but with the inability of quantum mechanics to predict single outcomes. 


We now show by means of a simple example what it means to describe an on- 
tologically quantum-mechanical apparatus classically, and outline the scenario we 
envisage for the solution of the measurement problem on the basis of this example. 
The Spehner–Haake model of the apparatus described below is too simple to be
realistic, but nonetheless it may serve its purpose (as Bohr would say). The model
involves a double-well potential like (10.11), modified however by a little basin in
the middle, as shown below (including ground states for one large and one small
value of $\hbar$). Also here, SSB will play a crucial role, so please recall §10.1.


Fig. 11.2 Double-well potential with basin; ground states for one large and one small value of $\hbar$.
Consider $N' = N + 1$ non-interacting particles, each with mass $m$, moving on
the real line under the influence of a one-particle potential $V$ (note that although
the zeroth particle will be handled slightly differently from the others, it is not the
pointer!). In terms of the canonical coordinates $(p', q') = (p_0, \ldots, p_N; q_0, \ldots, q_N) \in \mathbb{R}^{2N'}$
on the phase space $X = T^*\mathbb{R}^{N'}$, the classical Hamiltonian is

$$h(p', q') = \sum_{n'=0}^{N} \left( \frac{p_{n'}^2}{2m} + V(q_{n'}) \right). \tag{11.11}$$


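Now perform a canonical transformation to center of mass and relative coordinates

$$P = \sum_{n'=0}^{N} p_{n'}; \qquad Q = \frac{1}{N'} \sum_{n'=0}^{N} q_{n'}; \tag{11.12}$$

$$\pi_n = \sqrt{N'}\, p_n - \frac{1}{\sqrt{N'}} \sum_{n'=0}^{N} p_{n'}; \qquad \rho_n = \frac{1}{\sqrt{N'}} (q_n - q_0) \quad (n = 1, \ldots, N); \tag{11.13}$$

the center of mass $(P, Q)$ will be the pointer. The inverse transformation is given by

$$p_0 = \frac{P}{N'} - \frac{1}{\sqrt{N'}} \sum_{k=1}^{N} \pi_k; \tag{11.14}$$

$$p_n = \frac{P}{N'} + \frac{1}{\sqrt{N'}}\, \pi_n; \tag{11.15}$$

$$q_0 = Q - \frac{1}{\sqrt{N'}} \sum_{k=1}^{N} \rho_k; \tag{11.16}$$

$$q_n = Q + \sqrt{N'}\, \rho_n - \frac{1}{\sqrt{N'}} \sum_{k=1}^{N} \rho_k. \tag{11.17}$$

Granted that $\{p_{n'}, q_{k'}\} = \delta_{n'k'}$, $\{p_{n'}, p_{k'}\} = 0$, and $\{q_{n'}, q_{k'}\} = 0$, we then duly have
$\{P, Q\} = 1$ and $\{\pi_n, \rho_k\} = \delta_{nk}$, with all other elementary Poisson brackets vanishing.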
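
Since the transformation is linear and does not mix momenta with positions, its canonical character can be verified by a small matrix computation. The sketch below is an illustration under our own conventions: we assume that the new momenta are $(P, \pi_1, \ldots, \pi_N) = A p'$ with $\pi_n = \sqrt{N'} p_n - P/\sqrt{N'}$, and the new positions are $(Q, \rho_1, \ldots, \rho_N) = B q'$ with $\rho_n = (q_n - q_0)/\sqrt{N'}$, so that the elementary brackets between new momenta and new positions are the entries of $A B^T$. The function name `transformation_matrices` and the value $N = 4$ are hypothetical choices.

```python
import numpy as np

def transformation_matrices(n):
    """Matrices A, B with (P, pi_1..pi_N) = A p' and (Q, rho_1..rho_N) = B q'."""
    n_prime = n + 1
    root = np.sqrt(n_prime)
    a = np.zeros((n_prime, n_prime))
    b = np.zeros((n_prime, n_prime))
    a[0, :] = 1.0                      # P = sum of all p_{n'}
    b[0, :] = 1.0 / n_prime            # Q = average of all q_{n'}
    for k in range(1, n_prime):
        a[k, :] = -1.0 / root          # pi_k = sqrt(N') p_k - P / sqrt(N')
        a[k, k] += root
        b[k, k] = 1.0 / root           # rho_k = (q_k - q_0) / sqrt(N')
        b[k, 0] = -1.0 / root
    return a, b

N = 4
A, B = transformation_matrices(N)

# With {p_{n'}, q_{k'}} = delta_{n'k'}, the bracket of the i-th new momentum with
# the j-th new position equals (A B^T)_{ij}; canonical means A B^T = identity.
brackets = A @ B.T
print(np.allclose(brackets, np.eye(N + 1)))

# Inverse transformation, applied to randomly chosen new coordinates.
rng = np.random.default_rng(1)
P, Q = rng.normal(), rng.normal()
pi, rho = rng.normal(size=N), rng.normal(size=N)
root = np.sqrt(N + 1)
p_old = np.concatenate(([P / (N + 1) - pi.sum() / root], P / (N + 1) + pi / root))
q_old = np.concatenate(([Q - rho.sum() / root], Q + root * rho - rho.sum() / root))
print(np.allclose(A @ p_old, np.concatenate(([P], pi))),
      np.allclose(B @ q_old, np.concatenate(([Q], rho))))
```

The brackets among the new momenta, and among the new positions, vanish automatically for a transformation of this block form.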
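In terms of the new coordinates, the classical Hamiltonian (11.11) reads

$$h(P, Q, \pi, \rho) = h_A(P, Q) + h_{AE}(Q, \rho) + h_E(\pi), \tag{11.18}$$

where $\pi = (\pi_1, \ldots, \pi_N)$, $\rho = (\rho_1, \ldots, \rho_N)$, and the three partial Hamiltonians are

$$h_A(P, Q) = \frac{P^2}{2M} + N' V(Q); \tag{11.19}$$

$$h_E(\pi) = \frac{1}{2M} \left[ \sum_{n=1}^{N} \pi_n^2 + \Big( \sum_{n=1}^{N} \pi_n \Big)^2 \right]; \tag{11.20}$$

$$h_{AE}(Q, \rho) = \sum_{k=1}^{\infty} \frac{1}{k!} f_k(\rho) V^{(k)}(Q), \tag{11.21}$$

where $M = N'm$ is the total mass of the system, for simplicity we assumed $V$ to be
analytic (it will even be taken to be polynomial), and we abbreviated

$$f_k(\rho) = \left( -\frac{1}{\sqrt{N'}} \sum_{l=1}^{N} \rho_l \right)^k + \sum_{n=1}^{N} \left( \sqrt{N'} \rho_n - \frac{1}{\sqrt{N'}} \sum_{l=1}^{N} \rho_l \right)^k. \tag{11.22}$$

Note that $f_1(\rho) = 0$, so that to lowest order (i.e. $k = 2$) we have

$$h_{AE}(Q, \rho) = \left( \frac{1}{2} N \sum_{n=1}^{N} \rho_n^2 - \frac{1}{2} \sum_{k \neq l} \rho_k \rho_l \right) V''(Q) + \cdots \tag{11.23}$$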
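
The splitting of the Hamiltonian into pointer, environment and interaction parts can be tested numerically for a polynomial potential, for which the Taylor series in the interaction term terminates. The sketch below makes several assumptions of our own: the quartic $V(x) = (x^2 - 1)^2$ (without the basin), the values of $m$ and $N$, total mass $M = N'm$, and the environment term written as $(\sum_n \pi_n^2 + (\sum_n \pi_n)^2)/2M$, which is what substituting the inverse transformation into the kinetic energy yields.

```python
import numpy as np
from math import factorial

m, N = 1.3, 5                          # illustrative mass and particle number
N_prime = N + 1
M = N_prime * m                        # total mass of the N' particles
root = np.sqrt(N_prime)

# Illustrative quartic potential V(x) = (x^2 - 1)^2 = x^4 - 2 x^2 + 1 (an assumption).
V = np.polynomial.Polynomial([1.0, 0.0, -2.0, 0.0, 1.0])

def f(k, rho):
    """k-th moment of the displacements q_{n'} - Q, expressed in the rho variables."""
    s = rho.sum() / root
    return (-s) ** k + np.sum((root * rho - s) ** k)

rng = np.random.default_rng(2)
P, Q = rng.normal(), rng.normal()
pi, rho = rng.normal(size=N), rng.normal(size=N)

# Old coordinates obtained from the inverse transformation.
p = np.concatenate(([P / N_prime - pi.sum() / root], P / N_prime + pi / root))
q = np.concatenate(([Q - rho.sum() / root], Q + root * rho - rho.sum() / root))

h_total = np.sum(p ** 2) / (2 * m) + np.sum(V(q))     # original Hamiltonian
h_A = P ** 2 / (2 * M) + N_prime * V(Q)               # pointer
h_E = (np.sum(pi ** 2) + pi.sum() ** 2) / (2 * M)     # environment
h_AE = sum(f(k, rho) * V.deriv(k)(Q) / factorial(k)   # interaction (finite sum here)
           for k in range(1, V.degree() + 1))

# Closed form of f_2: N sum_n rho_n^2 - sum_{k != l} rho_k rho_l.
f2_closed = N * np.sum(rho ** 2) - (rho.sum() ** 2 - np.sum(rho ** 2))
print(abs(f(1, rho)), np.isclose(f(2, rho), f2_closed),
      np.isclose(h_total, h_A + h_E + h_AE))
```

The first printed number should vanish up to rounding (confirming $f_1 = 0$), and the two booleans check the lowest-order coefficient and the full decomposition, respectively.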


We pass to the corresponding quantum-mechanical Hamiltonians in the usual way, 
and couple a two-level quantum system to the apparatus through the Hamiltonian 


$$h_{SA} = \mu \cdot \sigma_3 \otimes P, \tag{11.24}$$


where the object observable $s = \sigma_3$, acting on $H_S = \mathbb{C}^2$, is to be measured. The idea
is that $h_A$ is the Hamiltonian of a pointer that registers outcomes by localization on
the real line, $h_E$ is the (free) Hamiltonian of the “environment”, realized as the internal
degrees of freedom of the total apparatus that are not used in recording
the outcome of the measurement, and $h_{AE}$ describes the pointer-environment interaction.
The classical description of the apparatus then involves two approximations:


• Ignoring all degrees of freedom except those of A, which classically are $(P, Q)$;
• Taking the classical limit of $h_A$, here realized as $N \to \infty$ (in lieu of $\hbar \to 0$).


The measurement of s is now expected to unfold according to the following scenario: 


1. The apparatus is initially in a metastable state (this is a very common assump- 
tion), whose wave-function is e.g. a Gaussian centered at the origin. 

2. If the object state is “spin up”, i.e., $\psi_S = (1, 0)$, then it kicks the pointer to the
right, where it comes to a standstill at the bottom of the double well. If spin is
down, likewise to the left. If $\psi_S = (1, 1)/\sqrt{2}$, the pointer moves to a superposition
of these, which is close to the ground state of $V$ displayed in Figure 11.2.

3. In the last case, the Flea mechanism of §10.2 comes into play: tiny asymmetric
perturbations irrelevant for small $N$ localize the ground state as $N \to \infty$.

4. Mere localization of the ground state of the perturbed (apparatus) Hamiltonian in
the classical regime is not enough: there should be a dynamical transition from
the ground state of the original (unperturbed) Hamiltonian (which has become
metastable upon perturbation) to the ground state of the perturbed one. The dynamical
mechanism in question should also recover the Born rule.


Thus the classical description of the apparatus is at the same time the root of the 
measurement problem and the key to its solution: it creates the problem because at 
first sight a Schrödinger Cat state has the wrong classical limit (namely a mixture),
but it also solves it, because precisely in the classical limit Cat states are destabilized 
even by the tiniest (asymmetric) perturbations and collapse to the “right” states. 
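
The static part of this scenario (step 3) can be illustrated by a crude numerical experiment. The sketch below is a toy model and not the Spehner–Haake model itself: we assume a single particle of unit mass in the symmetric double well $V(x) = (x^2 - 1)^2$ (no basin), use $\hbar$ rather than $N$ as the semiclassical parameter, model the flea as a small Gaussian bump of height $10^{-3}$ in the right-hand well, and discretize the Schrödinger operator by finite differences. All function names and numerical values are illustrative choices.

```python
import numpy as np

def ground_state(hbar, flea_strength, n_grid=1001, x_max=3.0):
    """Ground state of -(hbar^2/2) d^2/dx^2 + V + flea on a finite-difference grid (m = 1)."""
    x = np.linspace(-x_max, x_max, n_grid)
    dx = x[1] - x[0]
    potential = (x ** 2 - 1.0) ** 2                                     # symmetric double well
    flea = flea_strength * np.exp(-((x - 1.0) ** 2) / (2 * 0.2 ** 2))   # small bump in the right well
    kinetic = (hbar ** 2 / (2 * dx ** 2)) * (
        2 * np.eye(n_grid) - np.eye(n_grid, k=1) - np.eye(n_grid, k=-1)
    )
    _, vectors = np.linalg.eigh(kinetic + np.diag(potential + flea))
    psi = vectors[:, 0]                                                 # lowest eigenvector
    return x, psi / np.linalg.norm(psi)

def weight_left(hbar, flea_strength):
    """Probability of finding the particle at x < 0 in the ground state."""
    x, psi = ground_state(hbar, flea_strength)
    return float(np.sum(psi[x < 0] ** 2))

for hbar in (1.0, 0.05):
    print(hbar, weight_left(hbar, 1e-3))
```

For the large value of $\hbar$ the perturbed ground state remains almost evenly spread over both wells, whereas for the small value it sits almost entirely in the well away from the bump. This concerns localization of the ground state only; the dynamical transition required in step 4 is not modeled here.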




The “flea” perturbation might itself be a genuine random process, perhaps ulti- 
mately of quantum origin. In that case, the measurement merely amplifies the ran- 
domness that was already inherent in the flea by transferring it to the apparatus. 

Alternatively, the flea might be fundamentally deterministic (though it may 
nonetheless be modeled stochastically for pragmatic reasons). In principle, this 
would open the door to a restoration of determinism: for the flea now transfers its 
determinism (rather than its randomness) to the apparatus. The mistaken impression 
that quantum theory implies the irreducible randomness of nature then arises because
although measurement outcomes are determined, they are unpredictable “for
all practical purposes”, even in a way that (because of the exponential sensitivity to
the flea in $1/\hbar$ or $N$) dwarfs the unpredictability of classical chaotic systems.

Either way, the flea perturbation would naturally be different at each different run 
of an experiment under otherwise identical initial conditions, which motivates our 
critique of the counterfactual discussed after the proof of Theorem 11.4. 

The location of the flea plays a similar role to the position variable in Bohmian
mechanics, i.e., it is essentially a hidden variable. Recall the notions of Outcome
Independence (OI) and Parameter Independence (PI), reviewed in §6.5. Briefly, the
conjunction of OI and PI is equivalent to Bell’s locality condition, and if the latter
is satisfied, then the Bell inequalities hold. Since these are violated by quantum
mechanics, any hidden variable theory compatible with quantum mechanics must
violate OI or PI. Deterministic hidden variable theories necessarily satisfy OI, in
which case Bell’s Theorem or the Free Will Theorem shows that they must violate
PI in order to be compatible with quantum mechanics. A violation of PI leads to
possible superluminal signaling only if the hidden variable $z$ can be controlled. If
the wave-function $\psi$ is regarded as the hidden variable, then quantum theory itself
satisfies PI but violates OI (since $\psi$ can be prepared, the other way round would be
disastrous). Qua deterministic hidden variable theory, Bohmian mechanics satisfies
OI, and hence it violates PI; for the GRW collapse theory it is the other way round.
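
To make the violation of the Bell inequalities by quantum mechanics concrete, one may compute a CHSH combination of correlations for the singlet state. The choice of the CHSH form (for which Bell's locality condition implies the bound $|S| \leq 2$), of the singlet state, and of the particular measurement settings below is ours and is meant as an illustration only; §6.5 may phrase the inequalities differently.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Singlet state (|01> - |10>)/sqrt(2) of two spin-1/2 particles.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def correlation(a, b):
    """Quantum expectation value of a tensor b in the singlet state."""
    return float(np.real(singlet.conj() @ np.kron(a, b) @ singlet))

# Two settings per wing, chosen so as to maximize the violation.
a1, a2 = sigma_z, sigma_x
b1 = (sigma_z + sigma_x) / np.sqrt(2)
b2 = (sigma_z - sigma_x) / np.sqrt(2)

chsh = correlation(a1, b1) + correlation(a1, b2) + correlation(a2, b1) - correlation(a2, b2)
print(abs(chsh), 2 * np.sqrt(2))
```

The absolute value obtained is $2\sqrt{2}$, exceeding the local bound 2, so that at least one of OI and PI must fail in any hidden variable description reproducing these correlations.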

The fate of the flea therefore depends on the nature of the perturbation: if it is 
deterministic, the theory behaves like Bohmian mechanics in this respect and hence 
violates PI, whereas stochastic perturbations typically violate OI (and possibly also 
PI). Either way, no conflict with the said theorems arises. Moreover, in the Colbeck–Renner
Theorem, assumption CP fails for the flea scenario (assuming that, in view of
its limitation to finite-dimensional Hilbert spaces, the theorem is applicable at all!).

Besides such issues, others remain to be resolved, of which we just mention two: 


1. Collapse of the wave-function has become a tunneling process, whose static effects
are exponentially enhanced as $N \to \infty$ (or $\hbar \to 0$, as in §10.2). However,
tunneling times increase in the same way, so that the environment is needed not 
only to provide the perturbation, but also to speed up the dynamics of collapse. 

2. The flea not only destabilizes the Schrödinger Cat state (as desired), but also
destabilizes the intended outcome states (like those in $S_i$, cf. Theorem 11.4). Also
here the environment should play a decisive role in (re)stabilizing the latter but 
not the former, possibly through the mechanism of einselection, cf. §11.2. 
Notes


§11.1. The rise of orthodoxy

The literature on the measurement problem is vast. Apart from the annotated 
reprint volume Wheeler & Zurek (1983), relatively recent surveys and books
include Bell (1990b), Maudlin (1995), Busch, Lahti, & Mittelstaedt (1996), Bassi
& Ghirardi (2003), Mittelstaedt (2004), Wallace (2012), Allahverdyan, Balian, &
Nieuwenhuizen (2013), and Busch, Lahti, Pellonpää, & Ylinen (2016). In modal
interpretations of quantum mechanics, the measurement problem is (dubiously) conflated
with the far milder problem of value indefiniteness, see e.g. Bub (1997).


§11.2. The rise of modernity: Swiss approach and Decoherence 

The Swiss approach to the measurement problem was initiated by Jauch (1964), 
to be continued by e.g. Hepp (1972), Emch & Whitten-Wolfe (1976), and recently 
also by Hepp’s former student Fröhlich; see e.g. Fröhlich & Schubnel (2013) and
Blanchard, Fröhlich & Schubnel (2016). In addition, see Landsman (1991, 1995),
now seen as naive, as well as Breuer, Amann & Landsman (1993) and Sewell (2005).

Key early papers on decoherence were Zeh (1970), Zurek (1981), and Joos & Zeh 
(1985), and standard reviews are Zurek (2003), Joos et al (2003), and Schlosshauer 
(2007). Penetrating critiques include Janssen (2008) and Tanona (2013). See also 
Camilleri (2009a) and Freire (2009) for some history. 

A defence of QBism may be found in Caves, Fuchs, & Schack (2002b). 


§11.3. Insolubility theorems 

Insolubility theorems of the first kind go back to von Neumann (1932) and,
in his wake, Wigner (1963) and Fine (1970). Theorem 11.2 is (in even more general 
form) due to Busch & Shimony (1996); with slightly different assumptions, the spe- 
cial case proved in the main text is due to Brown (1986). The monographs by Busch, 
Lahti, & Mittelstaedt (1996) and Mittelstaedt (2004) contain detailed discussions of 
theorems of this kind. See also Bacciagaluppi (2014). 

The formulation of the problem of statistics and the problem of outcomes is taken 
from Maudlin (1995). Theorem 11.4 is due to Bassi & Ghirardi (2003), although 
here it is presented in a form inspired by Grübl (2003).

For Bohmian mechanics see e.g. Goldstein (2013) and Bricmont (2016). A recent 
review of the GRW program and related dynamical collapse theories is Bassi et al 
(2013). Nowadays, the locus classicus for Many Worlds is Wallace (2012). 

The time-evolution counterfactual discussed in the main text was inspired by the 
problem of free will, see the quotation of Dennett at the beginning of §6.3.


§11.4. The Flea on Schrödinger’s Cat

The approach to the measurement problem discussed here has its roots in Lands- 
man & Reuvers (2013) and Landsman (2013), whose model at the time only in- 
volved the apparatus. This was criticized in van Heugten & Wolters (2016), many 
of whose points may be addressed by turning to the Spehner–Haake model, introduced
by Spehner & Haake (2008). The ABN-model of Allahverdyan, Balian, &
Nieuwenhuizen (2013) gives a similar picture; for a comparison see Spehner (2009). 


Chapter 12 
Topos theory and quantum logic 


The topos-theoretic approach to quantum mechanics (also known as quantum 
toposophy) has the same origin as the quantum logic programme initiated by 
Birkhoff and von Neumann, namely the feeling that classical logic is inappropri- 
ate for quantum theory and needs to be replaced by something else. For example, 
Schrödinger’s Cat serves as an “intuition pump” for this feeling (at least in the naive
view, dispensed with in Chapter 11, that it is neither alive nor dead). However,
we feel that the quantum logic proposed by Birkhoff and von Neumann is: 


• too radical in giving up distributivity (rendering it problematic to interpret the
logical operations $\wedge$ and $\vee$ as conjunction and disjunction, respectively);

• not radical enough in keeping the law of excluded middle, which is precisely
what intuition pumps like Schrödinger’s cat and the like challenge.


Thus it would be preferable to have a quantum logic with exactly the opposite features,
i.e., one that is distributive but drops the law of excluded middle: this suggests
the use of intuitionistic logic. It is interesting to note that Birkhoff and von Neumann
(who had earlier corresponded with Brouwer about possible intuitionistic aspects of 
game theory, notably chess) actually considered intuitionistic logic, but rejected it: 


‘The models for propositional calculi which have been considered in the preceding sections 
are also interesting from the standpoint of pure logic. Their nature is determined by quasi- 
physical and technical reasoning, different from the introspective and philosophical consid- 
erations which have had to guide logicians hitherto. Hence it is interesting to compare the 
modifications which they introduce into Boolean algebra, with those which logicians on “intuitionist”
and related grounds have tried introducing. The main difference seems to be that
whereas logicians have usually assumed that properties L71–L73 [i.e. $(a')' = a$, $a \cap a' = \bot$,
$a \cup a' = \top$, and $a \subset b$ implies $a' \supset b'$] of negation were the ones least able to withstand a
critical analysis, the study of mechanics points to the distributive identities as the weakest
link in the algebra of logic. (...) Our conclusion agrees perhaps more with those critiques
of logic, which find most objectionable the assumption that $a' \cup b = \top$ implies $a \subset b$ (or,
dually, the assumption that $a \cap b' = \bot$ implies $b \supset a$ – the assumption that to deduce an
absurdity from the conjunction of a and not b, justifies one in inferring that a implies b).’
(Birkhoff & von Neumann, 1936, p. 837). 


As already made clear, then, our view is exactly the opposite. It is perhaps more 
striking that our position on (quantum) logic also differs from Bohr’s: 
‘All departures from common language and ordinary logic are entirely avoided by reserving
the word “phenomenon” solely for reference to unambiguously communicable information, 
in the account of which the word “measurement” is used in its plain meaning of standardized 
comparison.’ (Bohr, 1996, p. 393) 


Rather than postulate the logical structure of quantum mechanics, our goal is to
derive it from our Bohrification ideology, more specifically, from the poset $\mathcal{C}(A)$
of all unital commutative C*-subalgebras of a unital C*-algebra $A$, ordered by inclusion.
One may think of this poset as a mathematical home for Bohr’s notion of
Complementarity, in that each $C \in \mathcal{C}(A)$ represents some classical or experimental
context, which has been decoupled from the others, except for the inclusion relations,
which relate compatible experiments (in general there seem to be no preferred
pairs of complementary subalgebras $C, C' \in \mathcal{C}(A)$ that jointly generate $A$, although
Bohr typically seems to have had such pairs in mind, e.g. position and momentum).

Quantum toposophy also accommodates the feeling that quantum mechanics is 
so radical that not just the actors of classical mechanics, but its whole stage must be 
replaced. This need is well expressed by the following quotation from Grothendieck, 
who created topos theory (but never witnessed its application to quantum theory): 


‘Passer de la mécanique de Newton à celle d’Einstein doit être un peu, pour le mathématicien,
comme de passer du bon vieux dialecte provençal à l’argot parisien dernier cri. Par contre,
passer à la mécanique quantique, j’imagine, c’est passer du français au chinois.’
(Grothendieck, 1986, p. 61).¹


Indeed, topos theory replaces even set theory, seen as the stage of classical math- 
ematics and physics, by some other stage: each topos provides a “universe of dis- 
course” in which to do mathematics. One major difference with set theory, then, is 
that logic in most toposes (including the ones we will use) is ... intuitionistic! 

This chapter presupposes familiarity with §C.11 on the logical side of the Gelfand 
isomorphism for commutative C*-algebras, Appendix D on lattice theory and logic, 
and Appendix E on topos theory. Since this material is off the beaten track, as in 
Chapter 6 it may be helpful to provide a very brief guided tour through this chapter. 

In §12.1 we first define the “quantum mechanical” topos $T(A)$ that will act as
the mathematical stage for the remainder of the chapter; it depends on some given
(unital) C*-algebra $A$ only via the poset $\mathcal{C}(A)$. We then define C*-algebras internal
to any topos $T$ (in which the natural numbers and hence the rationals can be
defined), which notion we then apply to $T = T(A)$, so as to define an internal C*-algebra
$\underline{A}$, which turns out to be commutative. Following an interlude on constructive
Gelfand spectra in §12.2, in §12.3 we then compute the internal Gelfand spectrum
of $\underline{A}$ for $A = M_n(\mathbb{C})$, and derive our intuitionistic logic of quantum mechanics
from this, given by eqs. (12.95) - (12.96) and (12.103) - (12.107). We also discuss its
(Kripke) semantics. In §12.4 we generalize these computations to arbitrary (unital)
C*-algebras $A$, culminating in Corollary 12.22. Finally, in §12.5 we relate this material
both to the Kochen–Specker Theorem (which provided the original motivation
for quantum toposophy) and to an attempt at ontology called “Daseinisation.”


¹ ‘For a mathematician, switching from Newton’s mechanics to Einstein’s must to some extent
be like switching from a good old provincial dialect to Paris slang. In contrast, I imagine that
switching to quantum mechanics amounts to switching to Chinese.’ Translation by the author.
12.1 C*-algebras in a topos


Let $A$ be a unital C*-algebra (in Sets), with associated poset $\mathcal{C}(A)$ of all unital commutative
C*-subalgebras $C \subset A$ ordered by inclusion. Regarding $\mathcal{C}(A)$ as a (posetal)
category, in which there is a unique arrow $C \to D$ iff $C \subseteq D$ and there are no other
arrows, we obtain the topos $T(A)$ of functors $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$ ($F$ underlined!), i.e.,


$$T(A) = [\mathcal{C}(A), \mathsf{Sets}]. \tag{12.1}$$


Since for any poset $X$ we have an isomorphism of categories $[X, \mathsf{Sets}] \simeq \mathrm{Sh}(X)$,
where $X$ is endowed with the Alexandrov topology, see (E.84), we may alternatively
write


$$T(A) \simeq \mathrm{Sh}(\mathcal{C}(A)). \tag{12.2}$$


This alternative description will turn out to be very useful in computing the Gelfand
spectrum of the internal commutative C*-algebra $\underline{A}$ to be defined shortly. Since we
occasionally switch between $T(A)$ and the topos Sets, we underline objects (i.e.,
functors $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$) of the former. In order to do some kind of Analysis in
$T(A)$, we need real numbers. In many toposes this is a tricky concept, but:


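Proposition 12.1. In $T(A)$, the Dedekind reals are given by the constant functor

$$\underline{\mathbb{R}}_0 : C \mapsto \mathbb{R}, \tag{12.3}$$

where $C \in \mathcal{C}(A)$, with associated frame given by the functor

$$\mathcal{O}(\underline{\mathbb{R}})_0 : C \mapsto \mathcal{O}((\uparrow C) \times \mathbb{R}). \tag{12.4}$$

Similarly, we have complex numbers $\underline{\mathbb{C}}$ and their frame $\mathcal{O}(\underline{\mathbb{C}})$ in $T(A)$.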


Proof. In a general sheaf topos $\mathrm{Sh}(X)$, the Dedekind real numbers object is the
sheaf (E.150), with frame (E.149). The point now is that each continuous function
$f \in C(\mathcal{C}(A), \mathbb{R})$ on $X = \mathcal{C}(A)$ with the Alexandrov topology is locally constant.

To see this, suppose $C \leq D$ in $U$, and take $V \subset \mathbb{R}$ open with $f(C) \in V$. Then
$C \in f^{-1}(V)$ and $f^{-1}(V)$ is open by continuity of $f$. But the smallest open set containing
$C$ is $\uparrow C$, which contains $D$, so that $f(D) \in V$. Taking $V = (f(C) - \epsilon, \infty)$
gives the inequality $f(D) > f(C) - \epsilon$ for all $\epsilon > 0$, whence $f(D) \geq f(C)$, whereas
$V = (-\infty, f(C) + \epsilon)$ yields $f(D) \leq f(C)$. Hence $f(C) = f(D)$.

Thus we obtain (12.3) - (12.4) as special cases of (E.150) - (E.149).
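
The key step of this proof, namely that a continuous real-valued function on a poset with the Alexandrov topology takes equal values on comparable elements, can be checked by brute force on a small example. The sketch below uses a hypothetical five-element poset (a bottom, three incomparable middle elements and a top) and restricts attention to functions with values in $\{0, 1\} \subset \mathbb{R}$, for which continuity amounts to both preimages being upper sets; these simplifications are ours.

```python
from itertools import product

# Illustrative finite poset: bot < a, b, c < top (the order relation is listed explicitly).
elements = ["bot", "a", "b", "c", "top"]
leq = {(x, x) for x in elements}
leq |= {("bot", x) for x in elements} | {(x, "top") for x in elements}

def is_upper(subset):
    """Alexandrov-open sets are exactly the upper sets."""
    return all(y in subset for x in subset for y in elements if (x, y) in leq)

def is_continuous(f):
    """For values in {0, 1} it suffices to test the preimages of 0 and of 1."""
    return all(is_upper({x for x in elements if f[x] == value}) for value in (0, 1))

def constant_on_comparable(f):
    """f(C) = f(D) whenever C <= D."""
    return all(f[x] == f[y] for (x, y) in leq)

functions = [dict(zip(elements, values)) for values in product((0, 1), repeat=len(elements))]
continuous = [f for f in functions if is_continuous(f)]
print(len(continuous),
      all(is_continuous(f) == constant_on_comparable(f) for f in functions))
```

Because this poset has a bottom element, only the two constant functions survive, in line with the constant functor appearing in the proposition.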


Other objects of interest in T(A) that we will steadily use are: 


• The terminal object $\underline{1}$, i.e., the constant functor $C \mapsto *$, where $*$ is a singleton.
• The truth object $\underline{\Omega}$, which according to (E.86) - (E.87) is given by


$$\underline{\Omega}_0(C) = \mathrm{Upper}(C); \tag{12.5}$$
$$\underline{\Omega}_1(C \subseteq D) = (-) \cap (\uparrow D), \tag{12.6}$$

where $\mathrm{Upper}(C)$ is the set of all upper sets above $C$ (i.e., $S \in \mathrm{Upper}(C)$ iff $S \subseteq \mathcal{C}(A)$
such that: (i) $C \subseteq D$ for each $D \in S$, and (ii) $D \in S$ and $D \subseteq E$ imply $E \in S$).

• The subobject classifier $t : \underline{1} \to \underline{\Omega}$, which is a natural transformation whose components
$t_C$ are given, according to (E.88), as


$$t_C(*) = \uparrow C, \tag{12.7}$$


i.e., the set of all $D \supseteq C$ in $\mathcal{C}(A)$; this is the maximal element of $\mathrm{Upper}(C)$.


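Furthermore, exponentials in $\mathsf{T}(A)$ have the following straightforward description:
$$\underline{F}^{\underline{G}}(C) = \mathrm{Nat}(\underline{G}_{\uparrow C}, \underline{F}_{\uparrow C}) \quad (C \in \mathcal{C}(A)), \tag{12.8}$$


where $\underline{F}_{\uparrow C}$ is the restriction of the functor $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$ to $\uparrow C \subseteq \mathcal{C}(A)$, and
$\mathrm{Nat}(-,-)$ denotes the set of natural transformations between the functors in question.
In particular, since $\mathbb{C} \cdot 1$ is the bottom element of the poset $\mathcal{C}(A)$, one has


$$\underline{F}^{\underline{G}}(\mathbb{C} \cdot 1) = \mathrm{Nat}(\underline{G}, \underline{F}). \tag{12.9}$$
One way to derive (12.8) is to start from general sheaf toposes $\mathrm{Sh}(X)$, where
$$\underline{F}^{\underline{G}}(U) = \mathrm{Nat}(\underline{G}_{|U}, \underline{F}_{|U}), \tag{12.10}$$


both restricted to $\mathcal{O}(U)$ (i.e. defined on each open $V \subseteq U$ instead of all $V \in \mathcal{O}(X)$),
and use (E.84). Combining these observations, one has


$$\underline{\Omega}^{\underline{F}}(C) = \mathrm{Sub}(\underline{F}_{\uparrow C}), \tag{12.11}$$
i.e., the set of subfunctors of $\underline{F}_{\uparrow C}$. In particular, like in (12.9), we find
$$\underline{\Omega}^{\underline{F}}(\mathbb{C} \cdot 1) = \mathrm{Hom}(\underline{F}, \underline{\Omega}) \cong \mathrm{Sub}(\underline{F}), \tag{12.12}$$


the set of subfunctors of $\underline{F}$ itself. Recall that, as explained after Lemma E.16, a
subfunctor $\underline{Z} \in \mathrm{Sub}(\underline{F})$ is a functor $\underline{Z} : \mathcal{C}(A) \to \mathsf{Sets}$ for which $\underline{Z}_0(C) \subseteq \underline{F}_0(C)$ for
all $C \in \mathcal{C}(A)$ and $\underline{Z}_1$ is the restriction of $\underline{F}_1$. If $C \subseteq D$, then the set-theoretic map
$\underline{\Omega}^{\underline{F}}(C) \to \underline{\Omega}^{\underline{F}}(D)$ defined by $\underline{\Omega}^{\underline{F}}_1$, identified with a map $\mathrm{Sub}(\underline{F}_{\uparrow C}) \to \mathrm{Sub}(\underline{F}_{\uparrow D})$, is
simply given by restricting a given subfunctor of $\underline{F}_{\uparrow C}$ to $\uparrow D$.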


Using either the internal language of a topos (see §E.5) or direct object-arrow
constructions, one can copy standard definitions in set theory so as to define math- 
ematical objects “internal” to any given topos, as long as these definitions make 
sense in first-order intuitionistic logic (which roughly speaking means that they are 
“constructive”, in not using the axiom of choice or the law of the excluded middle). 

As a case in point, let us now define internal C*-algebras in $\mathsf{T}(A)$ (this may
be done even more generally in any topos $\mathsf{T}$ in which at least the natural numbers
$\mathbb{N}$, and hence the rationals $\mathbb{Q}$, are defined). Vector spaces (over $\mathbb{R}$ or $\mathbb{C}$) and
(commutative) *-algebras may be defined in $\mathsf{T}(A)$ through straightforward object-arrow
translations of the usual constructions in $\mathsf{Sets}$, i.e., one has an object $A$ and arrows:
$$\cdot : \mathbb{C} \times A \to A \quad \text{(scalar multiplication)}; \tag{12.13}$$
$$+ : A \times A \to A \quad \text{(addition)}; \tag{12.14}$$
$$\times : A \times A \to A \quad \text{(multiplication)}; \tag{12.15}$$
$$* : A \to A \quad \text{(involution)}, \tag{12.16}$$


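subject to the usual axioms. Syntactically, a unit (internal) in $A$ is a constant
$$1_A : \underline{1} \to A,$$


with $\underline{1}$ the terminal object in $\mathsf{T}(A)$, such that


$$\left(A \xrightarrow{\;\cong\;} \underline{1} \times A \xrightarrow{\;1_A \times \mathrm{id}\;} A \times A \xrightarrow{\;\times\;} A\right) = \left(A \xrightarrow{\;\mathrm{id}\;} A\right). \tag{12.17}$$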


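The notions of norm and completeness are less easily defined internally, and
hence one starts reinterpreting the notion of a seminorm in $\mathsf{Sets}$ as a subset


$$N \subseteq A \times \mathbb{Q}^+, \tag{12.18}$$


for which
$$(a,q) \in N \text{ iff } \|a\| < q. \tag{12.19}$$
In our topos $\mathsf{T}(A)$, we interpret $N \subseteq A \times \mathbb{Q}^+$ as a subfunctor $N \rightarrowtail A \times \mathbb{Q}^+$ (or,
equivalently by $\lambda$-conversion (E.153), as an arrow $\underline{1} \to \underline{\Omega}^{A \times \mathbb{Q}^+}$), subject to the axioms:


$$\forall_p\, (p > 0 \rightarrow (0,p) \in N); \tag{12.20}$$
$$\exists_q\, (q > 0 \wedge (a,q) \in N); \tag{12.21}$$
$$\forall_a \forall_p\, ((a,p) \in N \rightarrow (a^*,p) \in N); \tag{12.22}$$
$$\forall_a \forall_q\, ((a,q) \in N \leftrightarrow \exists_{p<q}\, (a,p) \in N); \tag{12.23}$$
$$\forall_{a,b} \forall_{p,q}\, ((a,p) \in N \wedge (b,q) \in N \rightarrow (a+b, p+q) \in N); \tag{12.24}$$
$$\forall_{a,b} \forall_{p,q}\, ((a,p) \in N \wedge (b,q) \in N \rightarrow (a \cdot b, p \cdot q) \in N); \tag{12.25}$$
$$\forall_a \forall_{p,q} \forall_z\, ((a,p) \in N \wedge (|z| < q) \rightarrow (z \cdot a, p \cdot q) \in N). \tag{12.26}$$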


Here $a, b$ are variables of type $A$, $p$ and $q$ are variables of type $\mathbb{Q}$, $z$ is a variable
of type $\mathbb{C}$, $0$ is the zero constant in $A$, etc. For a unital *-algebra (whose internal
definition we leave to the reader), with unit denoted by $1_A$ as usual, we also require


$$\vdash \forall_p\, (p > 1 \rightarrow (1_A, p) \in N). \tag{12.27}$$
If the seminorm relation furthermore satisfies
$$(a^* \cdot a, q^2) \in N \leftrightarrow (a,q) \in N \tag{12.28}$$


for all $a \in A$ and $q \in \mathbb{Q}^+$, then $A$ is said to be a pre-semi-C*-algebra.
To proceed to a C*-algebra, one requires $a = 0$ whenever $(a,q) \in N$ for all $q$ in
$\mathbb{Q}^+$, making the seminorm into a norm, and subsequently this normed space should
be complete. The latter condition is quite complicated, since in a topos one has no
Cauchy sequences in the usual sense, because $A$ may not have global elements (in
the sense of arrows $\underline{1} \to A$). Indeed, our algebra $\underline{A}$ defined below only has trivial
global elements, namely multiples of the unit operator.

Hence one needs a generalization of Cauchy sequences in the general spirit of
topos theory, where global elements are replaced by general elements.


Definition 12.2. With $\underline{\mathbb{N}}$ the natural numbers object in $\mathsf{T}(A)$ (which is simply the
constant functor $C \mapsto \mathbb{N}$), a Cauchy approximation in $A$ is an arrow $s : \underline{\mathbb{N}} \to \underline{\Omega}^{A}$
(or, equivalently, by $\lambda$-conversion (E.153), an arrow $\chi : \underline{\mathbb{N}} \times A \to \underline{\Omega}$, which in turn
is the same as a subobject $S$ of $\underline{\mathbb{N}} \times A$) such that:


$$\forall_n \exists_a\; a \in S_n; \tag{12.29}$$
$$\forall_k \exists_m \forall_{n,n'} \forall_{a,a'}\; (n > m,\, n' > m,\, a \in S_n,\, a' \in S_{n'}) \rightarrow (a - a', 1/k) \in N. \tag{12.30}$$


Here (for brevity) the first three commas (but not the last!) stand for $\wedge$, and $a \in S_n$
denotes $(n,a) \in S$, where $S$ is the above subobject of $\underline{\mathbb{N}} \times A$ classified by $\chi$ (we use
the notation explained in item 9 at the end of §E.5, where the variable $x : X$ is now
the pair $(n,a)$ of type $\underline{\mathbb{N}} \times A$). Moreover, a Cauchy approximation converges to $b$ if:


$$\forall_k \exists_m \forall_n\; (n > m,\, a \in S_n) \rightarrow (a - b, 1/k) \in N, \tag{12.31}$$


and we call $A$ complete if each Cauchy approximation in $A$ converges.
Finally, a C*-algebra in $\mathsf{T}(A)$ (and similarly in any topos with natural numbers)
is a complete pre-semi-C*-algebra in which the seminorm is a norm.


Homomorphisms and isomorphisms between such (internal) C*-algebras may be 
defined in the usual way, bijections in set theory being replaced by isomorphisms 
of objects. We only consider internal C*-algebras with unit, so that we may define 
internal categories CA, (and CCA,) of (commutative) unital C*-algebras in T(A) in 
the obvious way (where the homomorphisms are required to preserve the unit). 


We now come to the basic construction that underlies “quantum toposophy”.


Theorem 12.3. Let $A$ be a unital C*-algebra. Define a functor $\underline{A} \in \mathsf{T}(A)$ by


$$\underline{A} : \mathcal{C}(A) \to \mathsf{Sets}; \tag{12.32}$$
$$\underline{A}_0(C) = C; \tag{12.33}$$
$$\underline{A}_1(C \subseteq D) = (C \hookrightarrow D). \tag{12.34}$$


Then $\underline{A}$ is an internal unital commutative C*-algebra under pointwise operations.


Here $A$ is meant to be an “ordinary” unital C*-algebra, i.e., defined in $\mathsf{Sets}$. Note that
the symbol $C$ in (12.33) changes character from left to right: on the left-hand side it
is a point in $\mathcal{C}(A)$, whereas on the right-hand side it is a subset of $A$. Nonetheless,
one might describe $\underline{A}$ as the tautological functor in $[\mathcal{C}(A), \mathsf{Sets}]$.
The pointwise operations in $\underline{A}$ are the obvious natural transformations that are
ultimately defined by the corresponding operations in each commutative C*-algebra
$C$. For example, addition $+ : \underline{A} \times \underline{A} \to \underline{A}$ is a natural transformation with components
$+_C : C \times C \to C$ defined in $C$, etc. Commutativity of $\underline{A}$ then trivially follows from
commutativity of each commutative C*-subalgebra $C$.

As already mentioned, the unit $1_{\underline{A}}$ is syntactically a constant $1_{\underline{A}} : \underline{1} \to \underline{A}$, whose
components $(1_{\underline{A}})_C : * \to C$ are just the units $1_C$ in each $C$ (recall that elements of
our poset $\mathcal{C}(A)$ were defined as unital commutative C*-subalgebras of $A$!).

Finally, we regard the (semi)norm $N$ as a subobject of $\underline{A} \times \underline{\mathbb{R}}^+$ (or $\underline{A} \times \underline{\mathbb{Q}}^+$),
hence as a natural transformation, with components $N_C \subseteq C \times \mathbb{R}^+$ defined by


$$(c,q) \in N_C \text{ iff } \|c\| < q, \tag{12.35}$$


where $\| \cdot \|$ is the norm in $C$ (which of course is inherited from $A$).


Proof. The proof is a straightforward verification, except perhaps for completeness.
First, the above subobject $S$ of $\underline{\mathbb{N}} \times \underline{A}$, realized as a subfunctor as usual, looks as
follows: for each $C \in \mathcal{C}(A)$ we have a subset $S_C \subseteq \mathbb{N} \times C$, regarded as a sequence
$(C_n)$ of subsets of $C$ through the identification $(n,c) \in S_C$ iff $c \in C_n$, such that $C_n \subseteq
D_n$ whenever $C \subseteq D$. Unfolding axiom (12.29) using the Kripke-Joyal semantics
rules listed at the end of §E.5, we find that this axiom holds iff:


$$\forall_{C \in \mathcal{C}(A)} \forall_{n \in \mathbb{N}} \exists_{c \in C} \forall_{D \supseteq C}\; c \in D_n, \tag{12.36}$$


which is satisfied iff each of the above subsets $C_n \subseteq C$ is non-empty. By a similar
analysis, axiom (12.30) is satisfied iff for each $\varepsilon > 0$ there is $m \in \mathbb{N}$ such that for all
$n, n' > m$ and all $c \in C_n$, $c' \in C_{n'}$ one has $\|c - c'\| < \varepsilon$ in $C$. This simply means that
any choice $(c_n)$ where $c_n \in C_n$ is a Cauchy sequence in $C$. Accordingly, $\underline{A}$ is complete
provided each such sequence converges, i.e., iff each $C \in \mathcal{C}(A)$ is complete. Since
these $C$'s are C*-subalgebras of $A$, this is simply true by construction.


In a similar way, one easily proves the following generalization of Theorem 12.3: 


Theorem 12.4. Let $\mathsf{C}$ be a small category. Any internal C*-algebra in the associated
presheaf topos $[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$ is given by a contravariant functor $\underline{A} : \mathsf{C} \to \mathbf{CA}$, where
$\mathbf{CA}$ is the category that has C*-algebras as objects and homomorphisms as arrows.
Moreover, $\underline{A}$ is unital/commutative iff each C*-algebra $\underline{A}(C)$ is unital/commutative.


It should be mentioned that internal C*-algebras on sheaf toposes $\mathsf{T} = \mathrm{Sh}(X)$ are not
covered by this theorem (except in the somewhat degenerate case we use, namely
$X = \mathcal{C}(A)$ with the Alexandrov topology). As a case in point, we just mention the
beautiful fact that internal C*-algebras in $\mathrm{Sh}(X)$ correspond to continuous bundles
of C*-algebras over $X$ (in $\mathsf{Sets}$).
12.2 The Gelfand spectrum in constructive mathematics


In this chapter we rely on a particular construction of the frame $\mathcal{O}(\Sigma(A))$ (cf. §C.11)
that can be generalized to topos theory (in which the Gelfand spectrum $\Sigma(A)$ of an
internal commutative C*-algebra $A$ is a locale). We start with some lattice lore.


Definition 12.5. Let $L$ be a distributive lattice with top $\top$ and bottom $\bot$.


1. A lower set in $L$ is a subset $S \subseteq L$ such that if $x \in S$ and $y \leq x$, then $y \in S$. We
denote the poset of all lower subsets of $L$, ordered by inclusion, by $D(L)$.

2. An ideal in a lattice $L$ is a lower set $I$ in $L$ such that $x, y \in I$ implies $x \vee y \in I$. The
poset of all ideals in a lattice $L$, ordered by inclusion, is denoted by $\mathrm{Idl}(L)$.

3. We say that $x \ll y$ (in words: “$x$ is well inside $y$” or “$x$ is rather below $y$”) iff
there exists $z$ such that $x \wedge z = \bot$ and $y \vee z = \top$. Note that $x \ll y$ implies $x \leq y$, as


$$x = x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z) = x \wedge y \leq y. \tag{12.37}$$


4. An ideal $I \in \mathrm{Idl}(L)$ is regular if the condition $I \supseteq \{y \in L \mid y \ll x\}$ implies $x \in I$.
The poset of regular ideals in $L$, ordered by inclusion, is called $\mathrm{RIdl}(L)$, i.e.,


$$\mathrm{RIdl}(L) = \{I \in \mathrm{Idl}(L) \mid (\forall_{y \in L}\; y \ll x \Rightarrow y \in I) \Rightarrow x \in I\}. \tag{12.38}$$


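The posets $D(L)$, $\mathrm{Idl}(L)$ and $\mathrm{RIdl}(L)$ are easily seen to be frames. Any ideal $I \in
\mathrm{Idl}(L)$ can be regularized, i.e., turned into a regular ideal $\mathcal{A}(I)$, by means of the
restriction to $\mathrm{Idl}(L) \subseteq D(L)$ of the “closure” map $\mathcal{A} : D(L) \to D(L)$ defined by


$$\mathcal{A}(I) = \{x \in L \mid \forall_{y \in L}\; y \ll x \Rightarrow y \in I\}. \tag{12.39}$$
In terms of $\mathcal{A}$, the canonical map $x \mapsto \downarrow x$ from $L$ to $\mathrm{Idl}(L)$ “regularizes” to a map


$$f : L \to \mathrm{RIdl}(L); \tag{12.40}$$
$$x \mapsto \mathcal{A}(\downarrow x). \tag{12.41}$$


For $I \in \mathrm{RIdl}(L)$ we obviously have $\mathcal{A}(I) = I$, and hence we may write
$$\mathrm{RIdl}(L) = \{I \in \mathrm{Idl}(L) \mid \mathcal{A}(I) = I\}. \tag{12.42}$$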
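The notions in Definition 12.5 can be made concrete on a small finite lattice. The Python sketch below is an illustration only, under assumptions that are not in the text: the lattice is taken to be the divisors of 12 ordered by divisibility (meet is gcd, join is lcm, bottom is 1, top is 12), and ideals are taken to be nonempty, so that in this finite lattice every ideal is principal.

```python
from math import gcd

N = 12
L = [d for d in range(1, N + 1) if N % d == 0]   # [1, 2, 3, 4, 6, 12]

def lcm(a, b):
    return a * b // gcd(a, b)

def well_inside(x, y):
    """x << y iff some z has meet(x, z) = bottom (1) and join(y, z) = top (N)."""
    return any(gcd(x, z) == 1 and lcm(y, z) == N for z in L)

def down(x):
    """Principal ideal generated by x: all divisors of x."""
    return {y for y in L if x % y == 0}

def is_regular(ideal):
    """Condition (12.38): x lies in the ideal whenever all y << x do."""
    return all(x in ideal for x in L
               if all(y in ideal for y in L if well_inside(y, x)))

# Generators x whose principal ideal is regular.
regular_generators = [x for x in L if is_regular(down(x))]
```

In this toy lattice the relation $x \ll y$ is strictly stronger than $x \leq y$ (for instance $2 \ll 4$ holds but $2 \ll 2$ fails), and only four of the six principal ideals turn out to be regular.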


Definition 12.6. 1. A frame $\mathcal{O}(X)$ with top element $\top$ is called compact if every
subset $S \subseteq \mathcal{O}(X)$ with $\bigvee S = \top$ has a finite subset $F \subseteq S$ with $\bigvee F = \top$.
2. A frame $\mathcal{O}(X)$ is called regular if each $V \in \mathcal{O}(X)$ satisfies


$$V = \bigvee \{U \in \mathcal{O}(X) \mid U \ll V\}. \tag{12.43}$$


When $\mathcal{O}(X)$ is the topology of some space $X$, the frame $\mathcal{O}(X)$ is compact (regular)
iff $X$ is compact (regular) as a space. Furthermore, $X$ is compact and Hausdorff iff
it is compact and regular, and hence the Gelfand spectrum $\Sigma(A)$ of a commutative
unital C*-algebra $A$ will be a compact and regular frame; see Theorem 12.8 below.
Recall that the self-adjoint part $A_{sa}$ of any C*-algebra $A$ is partially ordered by
putting $a \leq b$ iff $b - a \in A^+$, cf. §C.7. This partial order is, of course, inherited by
the positive cone $A^+ \subset A_{sa}$. If $A$ is commutative, this partial ordering makes $A_{sa}$ a
lattice; for example, if $A = C(X)$ the lattice operations are $a \vee b = \max\{a,b\}$ and
$a \wedge b = \min\{a,b\}$ (taken pointwise). In general, one may then compute $\vee$ and $\wedge$
from the Gelfand isomorphism $A \cong C(X)$, but they are intrinsically defined via $\leq$.

Let $A$ be a commutative unital C*-algebra. For $a, b \in A^+$, define $a \preccurlyeq b$ iff there
exists $n \in \mathbb{N}$ such that $a \leq nb$. Define $a \approx b$ iff $a \preccurlyeq b$ and $b \preccurlyeq a$. This is an equivalence
relation. Moreover, $\approx$ is a congruence, that is, an equivalence relation $\sim$ on a lattice
$L$ that is compatible with $\wedge$ and $\vee$ in the sense that $x \sim y$ and $x' \sim y'$ imply $x \wedge x' \sim
y \wedge y'$ and $x \vee x' \sim y \vee y'$. Given some congruence $\sim$ on $L$, one may define $\wedge$ and $\vee$ on
$L/\sim$ by $[x] \wedge [y] = [x \wedge y]$ and $[x] \vee [y] = [x \vee y]$, respectively, so that the set-theoretic
quotient $L/\sim$ inherits the lattice structure of $L$ and hence is a lattice in its own right.

This quotient construction by a congruence preserves distributivity, so that


$$L_A = A^+/\approx \tag{12.44}$$


is a distributive lattice. We will use the elements $D_a = [a^+]$ of $L_A$ (indexed by $a \in
A_{sa}$), where $[a^+]$ is the equivalence class in $L_A$ of the positive part $a^+$ in the canonical
decomposition $a = a^+ - a^-$, with $a^\pm \geq 0$ and $a^+ a^- = 0$; lattice-theoretically, one
has $a^+ = a \vee 0$ and $a^- = -(a \wedge 0)$. This gives a lattice homomorphism $A_{sa} \to L_A$, $a \mapsto D_a$,
whose restriction to $A^+$ is just the canonical projection $A^+ \to L_A$. These $D_a$ satisfy:


$$D_1 = \top; \tag{12.45}$$
$$D_a \wedge D_{-a} = \bot; \tag{12.46}$$
$$D_a = \bot \quad (a \leq 0); \tag{12.47}$$
$$D_{a+b} \leq D_a \vee D_b; \tag{12.48}$$
$$D_a \wedge D_b \leq D_{ab}; \tag{12.49}$$
$$D_a \leq D_{a \vee b}; \tag{12.50}$$


where the inequalities may also be written as equalities, since $x \leq y$ iff $x = x \wedge y$.
These relations are easy to check for $A = C(X)$, and hence they are true for any $A$.
The elements $D_a$ obviously exhaust $L_A$, and eqs. (12.45) - (12.50) imply:


$$a \leq b \Rightarrow D_a \leq D_b; \tag{12.51}$$
$$D_a = D_{a^+}; \tag{12.52}$$
$$D_{na} = D_a \quad (n \in \mathbb{N}); \tag{12.53}$$
$$D_{ab} = (D_a \wedge D_b) \vee (D_{-a} \wedge D_{-b}); \tag{12.54}$$
$$D_a \wedge D_b = D_{a \wedge b}. \tag{12.55}$$


For the Gelfand spectrum we need the frame $\mathrm{RIdl}(L_A)$, and hence the relation $\ll$.
Lemma 12.7. For all $D_a, D_b \in L_A$, we have (with both $q \in \mathbb{Q}^+$ and $q \in \mathbb{R}^+$):


$$D_b \ll D_a \text{ iff } \exists_{q > 0}\; D_b \leq D_{a-q}. \tag{12.56}$$




Proof. From right to left, just choose $D_c = D_{q-a}$. Conversely, if $A = C(X)$, it is easy
to see that if there exists $D_c \in L_A$ such that $D_c \vee D_a = \top$ and $D_c \wedge D_b = \bot$, then there
exists $q > 0$ such that $D_{c-q} \vee D_{a-q} = \top$. Hence $D_c \vee D_{a-q} = \top$, so that


$$D_b = D_b \wedge (D_c \vee D_{a-q}) = D_b \wedge D_{a-q} \leq D_{a-q}.$$
Note that by construction the map $f$ in (12.40) is given by
$$f(D_a) = \{D_c \in L_A \mid \forall_{D_b \in L_A}\; D_b \ll D_c \Rightarrow D_b \leq D_a\}, \tag{12.57}$$


and, by Lemma 12.7, satisfies


$$f(D_a) \leq \bigvee \{f(D_{a-q}) \mid q > 0\}. \tag{12.58}$$
For later use, also note that (12.57) implies
$$f(D_a) = \top \Leftrightarrow D_a = \top. \tag{12.59}$$


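Theorem 12.8. The topology $\mathcal{O}(\Sigma(A))$ of the Gelfand spectrum $\Sigma(A)$ of a commutative
unital C*-algebra $A$ is isomorphic to the frame of all regular ideals of $L_A$:


$$\mathcal{O}(\Sigma(A)) \cong \mathrm{RIdl}(L_A); \tag{12.60}$$
$$\{\omega \in \Sigma(A) \mid \omega(a) > 0\} \leftrightarrow D_a, \tag{12.61}$$


or, equivalently, for the opens $(r,s) \in \mathcal{O}(\mathbb{R})$ with ensuing opens $\hat{a}^{-1}(r,s)$ in $\mathcal{O}(\Sigma(A))$,
$$\hat{a}^{-1}(r,s) = \{\omega \in \Sigma(A) \mid \omega(a) \in (r,s)\} \leftrightarrow f(D_{s-a} \wedge D_{a-r}) \quad (r < s). \tag{12.62}$$


Moreover, on this isomorphism, $\mathcal{O}(\Sigma(A))$ is a compact regular frame.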


The proof of this theorem is unfortunately beyond our reach; instead, we now give
an alternative description of the frame $\mathrm{RIdl}(L_A)$, which will be useful for computational
purposes in topos theory. This again requires some more background in lattice
theory. Let $(L, \leq)$ be a meet semilattice (i.e., a poset in which any pair of elements
has an infimum; in most of our applications $(L, \leq)$ is actually a distributive lattice).


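Definition 12.9. A covering relation on $L$ is a relation $\lhd \subseteq L \times \mathcal{P}(L)$ (equivalently,
a function $L \to \mathcal{P}(\mathcal{P}(L))$), written $x \lhd U$ when $(x,U) \in \lhd$, such that:


1. If $x \in U$ then $x \lhd U$.

2. If $x \lhd U$ and $U \lhd V$ (i.e., $y \lhd V$ for all $y \in U$) then $x \lhd V$.

3. If $x \lhd U$ then $x \wedge y \lhd U$.

4. If $x \lhd U$ and $x \lhd V$, then $x \lhd U \wedge V$ (where $U \wedge V = \{x \wedge y \mid x \in U, y \in V\}$).


For example, if $(L, \leq) = (\mathcal{O}(X), \subseteq)$ one may take $x \lhd U$ iff $x \leq \bigvee U$, i.e., iff $U$
covers $x$. Also here we have a closure operation $\mathcal{A} : D(L) \to D(L)$, given by


$$\mathcal{A}U = \{x \in L \mid x \lhd U\}. \tag{12.63}$$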
This operation has the following properties:


$$\downarrow U \subseteq \mathcal{A}U; \tag{12.64}$$
$$U \subseteq \mathcal{A}V \Rightarrow \mathcal{A}U \subseteq \mathcal{A}V; \tag{12.65}$$
$$\mathcal{A}U \cap \mathcal{A}V \subseteq \mathcal{A}(\downarrow U \cap \downarrow V). \tag{12.66}$$


The frame $\mathcal{F}(L, \lhd)$ generated by such a structure is then defined by
$$\mathcal{F}(L, \lhd) = \{U \in D(L) \mid \mathcal{A}U = U\} = \{U \in \mathcal{P}(L) \mid x \lhd U \Rightarrow x \in U\}; \tag{12.67}$$


the second equality follows because firstly the property $\mathcal{A}U = U$ guarantees that
$U \in D(L)$, and secondly one has $\mathcal{A}U = U$ iff $x \lhd U$ implies $x \in U$. Defining


$$U \sim V \text{ iff } U \lhd V \text{ and } V \lhd U, \tag{12.68}$$
an equivalent description of the frame $\mathcal{F}(L, \lhd)$ that is occasionally useful is
$$\mathcal{F}(L, \lhd) \cong \mathcal{P}(L)/\sim. \tag{12.69}$$


Indeed, the map $U \mapsto [U]$ from $\mathcal{F}(L, \lhd)$ (as defined in (12.67)) to $\mathcal{P}(L)/\sim$ is a
frame map with inverse $[U] \mapsto \mathcal{A}U$. The idea behind the isomorphism (12.69) is
that the map $\mathcal{A}$ picks a unique representative in the equivalence class $[U]$, namely
$\mathcal{A}U$. As in (12.40) - (12.41), also here we have a canonical map


$$f : L \to \mathcal{F}(L, \lhd); \tag{12.70}$$
$$x \mapsto \mathcal{A}(\downarrow x), \tag{12.71}$$

which satisfies $f(x) \leq \bigvee f(U)$ if $x \lhd U$. In fact, $f$ is universal with this property, in
that any homomorphism $g : L \to \mathcal{G}$ of meet semilattices into a frame $\mathcal{G}$ such that
$g(x) \leq \bigvee g(U)$ whenever $x \lhd U$ has a factorisation $g = \varphi \circ f$ for some unique frame
map $\varphi : \mathcal{F}(L, \lhd) \to \mathcal{G}$. This may suggest the following result:


Proposition 12.10. Suppose one has a frame $\mathcal{F}$ and a meet semilattice $L$ with a
map $f : L \to \mathcal{F}$ of meet semilattices that generates $\mathcal{F}$ in the sense that for each
$U \in \mathcal{F}$ one has $U = \bigvee \{f(x) \mid x \in L, f(x) \leq U\}$. Define a cover relation $\lhd$ on $L$ by


$$x \lhd U \text{ iff } f(x) \leq \bigvee f(U). \tag{12.72}$$
Then one has a frame isomorphism $\mathcal{F} \cong \mathcal{F}(L, \lhd)$.
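As a sanity check of (12.63), (12.67) and Proposition 12.10, the frame generated by a covering relation can be enumerated for a tiny example. The sketch below is an illustration under assumptions not made in the text: $X$ is a two-point discrete space, $L = \mathcal{O}(X)$ is its power set, $f$ is the identity, and the covering relation is the one from the example after Definition 12.9, namely $x \lhd U$ iff $x \subseteq \bigcup U$.

```python
from itertools import chain, combinations

X = (0, 1)

def powerset(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

L = powerset(X)   # the meet semilattice O(X): all subsets of the discrete space X

def covers(x, U):
    """x is covered by U iff x is contained in the union of the members of U."""
    return x <= frozenset().union(*U)

def closure(U):
    """Closure operation (12.63): all x in L that are covered by U."""
    return frozenset(x for x in L if covers(x, U))

# The frame (12.67): subsets U of L that are fixed by the closure operation.
frame = {U for U in powerset(L) if closure(U) == U}

# Each closed U is determined by its union, an open subset of X.
unions = {frozenset().union(*U) for U in frame}
```

The closed subsets are exactly the down-sets $\downarrow W$ for $W \subseteq X$, so the generated frame has four elements and is isomorphic to $\mathcal{O}(X)$, as Proposition 12.10 predicts in this case.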
We now turn to maps between frames, from the point of view of coverings. 


Definition 12.11. Let $(L, \lhd)$ and $(M, \blacktriangleleft)$ be meet semilattices with covering relation
as above, and let $f^* : L \to \mathcal{P}(M)$ be such that:


1. $f^*(L) = M$;
2. $f^*(x) \wedge f^*(y) \blacktriangleleft f^*(x \wedge y)$;
3. $x \lhd U \Rightarrow f^*(x) \blacktriangleleft f^*(U)$ (where $f^*(U) = \bigcup_{u \in U} f^*(u)$).
If $L$ and $M$ have top elements $\top_L$ and $\top_M$, respectively, then the first condition
may be replaced by $f^*(\top_L) = \top_M$. Define two such maps $f_1^*, f_2^*$ to be equivalent
if $f_1^*(x) \sim f_2^*(x)$ (i.e., $f_1^*(x) \blacktriangleleft f_2^*(x)$ and $f_2^*(x) \blacktriangleleft f_1^*(x)$) for all $x \in L$. A continuous
map $f : (M, \blacktriangleleft) \to (L, \lhd)$ is an equivalence class of such maps $f^* : L \to \mathcal{P}(M)$.


Our main interest in continuous maps lies in the following result: 


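Proposition 12.12. Each continuous map $f : (M, \blacktriangleleft) \to (L, \lhd)$ is equivalent to a
frame map $\mathcal{F}(f) : \mathcal{F}(L, \lhd) \to \mathcal{F}(M, \blacktriangleleft)$, given by


$$\mathcal{F}(f) : U \mapsto \mathcal{A}f^*(U). \tag{12.73}$$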


We may now equip $L_A$ with the covering relation defined by (12.72), given
(12.60) and the ensuing map (12.57). Consequently, by Proposition 12.10 one has


$$\mathcal{O}(\Sigma) \cong \mathcal{F}(L_A, \lhd), \tag{12.74}$$
which yields the following expression for the constructive Gelfand spectrum:
$$\mathcal{O}(\Sigma) = \{U \in D(L_A) \mid x \lhd U \Rightarrow x \in U\}. \tag{12.75}$$
This lattice becomes computable through a lemma that is crucial for what follows:


Lemma 12.13. In any topos, the covering relation $\lhd$ on $L_A$ defined by (12.72) with
(12.60) and (12.57), is given by $D_a \lhd U$ iff for all $q > 0$ there exists a (Kuratowski)
finite $U_0 \subseteq U$ such that $D_{a-q} \leq \bigvee U_0$. If $U$ is directed, this means that there exists
$D_b \in U$ such that $D_{a-q} \leq D_b$.


Proof. The easy part is the “$\Leftarrow$” direction: from (12.58) and the assumption we have
$f(D_a) \leq \bigvee f(U)$ and hence $D_a \lhd U$ by definition of the covering relation.

In the opposite direction, assume $D_a \lhd U$ and take some $q > 0$. From (the proof
of) Lemma 12.7, $D_a \vee D_{q-a} = \top$, hence $\bigvee f(U) \vee f(D_{q-a}) = \top$. Since $\mathcal{O}(\Sigma)$ is
compact, there is a finite $U_0 \subseteq U$ for which $\bigvee f(U_0) \vee f(D_{q-a}) = \top$, so that by
(12.59) we have $D_b \vee D_{q-a} = \top$, with $D_b = \bigvee U_0$. By (12.46) we have


$$D_{a-q} \wedge D_{q-a} = \bot, \tag{12.76}$$


and hence


$$D_{a-q} = D_{a-q} \wedge \top = D_{a-q} \wedge (D_b \vee D_{q-a}) = D_{a-q} \wedge D_b \leq D_b = \bigvee U_0.$$


If $A$ is finite-dimensional, $L_A$ is a finite lattice. In that case, since $D_{a-q} = D_a$ for
small enough $q$, one simply has $x \lhd U$ iff $x \leq \bigvee U$, and the condition $x \lhd U \Rightarrow x \in U$
in (12.75) holds iff $U$ is a (principal) down set, i.e. $U = \downarrow x$ for some $x \in L_A$ (not the
same $x$ as the placeholder $x$ in (12.75)). Hence for finite-dimensional $A$ we obtain


$$\mathcal{O}(\Sigma(A)) \cong \mathrm{Idl}(L_A) = \{\downarrow x \mid x \in L_A\}. \tag{12.77}$$
12.3 Internal Gelfand spectrum and intuitionistic quantum logic


We are now going to combine the (a priori independent) material in the previous two
sections. The point of the above description of the topology $\mathcal{O}(\Sigma(A))$ of the Gelfand
spectrum $\Sigma(A)$ of a unital commutative C*-algebra $A$ is that it may be “internalized”
to any topos (with natural number object, i.e., in which C*-algebras may be defined
internally in the first place). The key to the ensuing generalization of Gelfand duality
is that in topos theory (and more generally in constructive mathematics) the space
$\Sigma(A)$ in set theory needs to be replaced by the corresponding frame $\mathcal{O}(\Sigma(A))$, or
preferably by its associated locale, which confusingly is denoted by $\Sigma(A)$, even
though it is the same thing as $\mathcal{O}(\Sigma(A))$ and neither may be spatial (in being the
topology of some space); see §C.11 and §E.4 for this bizarre notation. Similarly, we
write $f : X \to Y$ for a map between locales, which is essentially the same as the
frame map $f^{-1} : \mathcal{O}(Y) \to \mathcal{O}(X)$, but seen as a map in the opposite direction (where
once again nothing is assumed about possible spatiality of the frames in question).

Using this notation, the constructive Gelfand isomorphism (which is valid in 
any topos T in which commutative C*-algebras make sense) states: 


Theorem 12.14. For each (internal) commutative unital C*-algebra $A$ in $\mathsf{T}$ there
exists a compact regular locale $\Sigma(A)$ such that one has a Gelfand isomorphism


$$A \cong C(\Sigma(A), \mathbb{C}). \tag{12.78}$$


Furthermore, the locale $\Sigma(A)$ is uniquely determined by $A$ up to isomorphism and
its corresponding frame is given by Theorem 12.8 (or, more explicitly, by (12.75) in
conjunction with Lemma 12.13, all of which makes sense internally).


Here $\cong$ denotes (internal) isomorphism of (commutative) C*-algebras, and the
notation $C(\Sigma(A), \mathbb{C})$ stands for the object of all frame maps from $\mathcal{O}(\mathbb{C})$ to $\mathcal{O}(\Sigma(A))$
(which object turns out to be a commutative C*-algebra in any case). As usual, we
denote the Gelfand transform $A \to C(\Sigma(A), \mathbb{C})$ by $a \mapsto \hat{a}$, where, as explained above,
the locale map $\hat{a} : \Sigma(A) \to \mathbb{C}$ is really the reverse reading of the frame map


$$\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\Sigma(A)). \tag{12.79}$$


Note that in $\mathsf{Sets}$, the latter is given by its literal meaning, given $\hat{a} : \omega \mapsto \omega(a)$.

We will shortly apply this formalism to our internal C*-algebra $\underline{A}$ in the topos
$\mathsf{T}(A)$, but since these computations are a bit involved, as a warm-up we first apply
our machinery to a very simple case, namely $A = \mathbb{C}^n$ in $\mathsf{Sets}$. Recall (12.44) etc.


For $A = \mathbb{C}^n$ we have $A^+ = (\mathbb{R}^n)^+$, in which $(r_1, \ldots, r_n) \approx (s_1, \ldots, s_n)$ just in case
$r_i = 0$ iff $s_i = 0$ for all $i = 1, \ldots, n$. Hence each equivalence class under $\approx$ has a unique
representative of the form $[k_1, \ldots, k_n]$ with $k_i = 0$ or $k_i = 1$; the pre-images of such
an element of $L_A$ in $A^+$ under the natural projection $A^+ \to A^+/\approx$ are the diagonal
matrices whose $i$'th entry is zero if $k_i = 0$ and any nonzero positive number if $k_i = 1$.
The partial order in $L_A$ is pointwise, i.e. $[k_1, \ldots, k_n] \leq [l_1, \ldots, l_n]$ iff $k_i \leq l_i$ for all $i$.
Hence $L_{\mathbb{C}^n}$ is isomorphic as a distributive lattice to the lattice $\mathcal{P}(D_n(\mathbb{C})) \equiv \mathcal{P}(\mathbb{C}^n)$
of projections in $D_n(\mathbb{C})$, i.e. the lattice of diagonal projections in $M_n(\mathbb{C})$.
Under this isomorphism, $[k_1, \ldots, k_n]$ corresponds to the matrix $\mathrm{diag}(k_1, \ldots, k_n)$. If
we equip $\mathcal{P}(\mathbb{C}^n)$ with the usual partial ordering of projections on the Hilbert space
$\mathbb{C}^n$, viz. $e \leq f$ whenever $e\mathbb{C}^n \subseteq f\mathbb{C}^n$ (which coincides with their ordering as elements
of the positive cone of the C*-algebra $M_n(\mathbb{C})$), then this is even a lattice isomorphism.
Hence by (12.77), the frame $\mathcal{O}(\Sigma(\mathbb{C}^n))$ consists of all sets of the form $\downarrow e$, $e \in
\mathcal{P}(\mathbb{C}^n)$, partially ordered by inclusion. This means that


$$\mathcal{O}(\Sigma(\mathbb{C}^n)) \cong \mathcal{P}(\mathbb{C}^n), \tag{12.80}$$


under the further identification of $\downarrow p \subseteq \mathcal{P}(\mathbb{C}^n)$ with $p \in \mathcal{P}(\mathbb{C}^n)$. This starts out
just as an isomorphism of posets, and turns out to be one of frames (which in the
case at hand happen to be Boolean). To draw the connection with the usual spectrum
$\widehat{\mathbb{C}^n} = \{1, 2, \ldots, n\}$ of $\mathbb{C}^n$, we note that the right-hand side of (12.80) is isomorphic to
the discrete topology $\mathcal{O}(\widehat{\mathbb{C}^n})$ of $\widehat{\mathbb{C}^n}$ (i.e. its power set) under the frame isomorphism


$$\mathcal{P}(\mathbb{C}^n) \to \mathcal{O}(\widehat{\mathbb{C}^n});$$
$$\mathrm{diag}(k_1, \ldots, k_n) \mapsto \{i \in \{1, 2, \ldots, n\} \mid k_i = 1\}. \tag{12.81}$$
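The identification of $L_{\mathbb{C}^n}$ with 0/1 tuples can be checked numerically on sample vectors. The following Python sketch is an illustration only: the function names and the sample vectors (with non-negative integer entries, chosen to keep the arithmetic exact) are hypothetical, and the relation tested is "$a \leq nb$ componentwise for some natural number $n$", as in the definition preceding (12.44).

```python
def preceq(a, b):
    """a is dominated by b iff a <= n*b componentwise for some natural number n."""
    if any(ai > 0 and bi == 0 for ai, bi in zip(a, b)):
        return False
    # Smallest n that works where b is nonzero (ceiling division on integers).
    n = max([1] + [-(-ai // bi) for ai, bi in zip(a, b) if bi > 0])
    return all(ai <= n * bi for ai, bi in zip(a, b))

def support(a):
    """The 0/1 representative (k_1, ..., k_n) of the equivalence class of a."""
    return tuple(1 if ai > 0 else 0 for ai in a)

def support_leq(a, b):
    """Pointwise order on the 0/1 representatives."""
    return all(x <= y for x, y in zip(support(a), support(b)))

samples = [(0, 0, 0), (1, 0, 0), (5, 0, 0), (1, 2, 0),
           (0, 3, 7), (2, 2, 2), (9, 1, 4)]
classes = {support(a) for a in samples}
```

On these samples the preorder agrees with the pointwise order of supports, so two vectors are equivalent exactly when they vanish in the same coordinates.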


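We now describe the Gelfand transform (12.78) - (12.79) for self-adjoint $a$, so
that one has a (locale) map $A_{sa} \to C(\Sigma(A), \mathbb{R})$. Let $a = (a_1, \ldots, a_n) \in \mathbb{C}^n_{sa} = \mathbb{R}^n$.
With $\Sigma(\mathbb{C}^n)$ realized as $\widehat{\mathbb{C}^n}$, this just reads $\hat{a}(i) = a_i$, for $\hat{a} : \widehat{\mathbb{C}^n} \to \mathbb{C}$. The induced
frame map $\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\widehat{\mathbb{C}^n})$ is given by $U \mapsto \{i \in \{1, 2, \ldots, n\} \mid a_i \in U\}$, and
by (12.81), this is equivalent to


$$\hat{a}^{-1} : \mathcal{O}(\mathbb{R}) \to \mathcal{P}(\mathbb{C}^n);$$
$$U \mapsto 1_U(a), \tag{12.82}$$


where $U \in \mathcal{O}(\mathbb{R})$, and the right-hand side denotes the spectral projection $1_U(a)$
defined by the self-adjoint operator $a$ on the Hilbert space $\mathbb{C}^n$.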
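For diagonal $a$ the map (12.82) is easy to compute by hand, and a few lines of Python make the frame-map property visible. This is a minimal sketch under assumptions: opens of $\mathbb{R}$ are represented by their indicator functions, projections in $\mathcal{P}(\mathbb{C}^n)$ by 0/1 tuples, and the names `spectral_projection` and `interval` are hypothetical helpers introduced for illustration.

```python
def spectral_projection(a, indicator):
    """Diagonal of 1_U(a) for a = (a_1, ..., a_n), with U given by its indicator."""
    return tuple(1 if indicator(ai) else 0 for ai in a)

def interval(r, s):
    """Indicator function of the open interval (r, s)."""
    return lambda t: r < t < s

a = (-1.0, 0.5, 2.0)
U, V = interval(0, 3), interval(-2, 1)

e_U = spectral_projection(a, U)
e_V = spectral_projection(a, V)
# Intersections and unions of opens go to meets and joins of projections.
e_meet = spectral_projection(a, lambda t: U(t) and V(t))
e_join = spectral_projection(a, lambda t: U(t) or V(t))
```

Here the meet and join of diagonal projections are the pointwise minimum and maximum of the 0/1 tuples, which is what one expects from a frame map into the Boolean lattice $\mathcal{P}(\mathbb{C}^n)$.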

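After this warm-up, we now compute the Gelfand spectrum $\mathcal{O}(\Sigma(\underline{A}))$ in our topos
$\mathsf{T}(A)$, for the special case $A = M_n(\mathbb{C})$ (which is still an exercise for the general case).
For simplicity we write $\underline{L}$ for the lattice $L_{\underline{A}}$ in $\mathsf{T}(A)$; similarly, $\underline{\Sigma}$ stands for $\Sigma(\underline{A})$.

First, for arbitrary $A$, the lattice functor $\underline{L}$ can be computed “locally”, in the sense
that $\underline{L}_0(C) = L_C$, see Proposition 12.17 in §12.4 below, so that by (12.44) one has


$$\underline{L}_0(C) = C^+/\approx. \tag{12.83}$$


Let $\mathcal{P}(C)$ be the (Boolean) lattice of projections in $C$, and consider the functor


$$\underline{\mathcal{P}}_0(C) = \mathcal{P}(C); \tag{12.84}$$
$$\underline{\mathcal{P}}_1(C \subseteq D) = (\mathcal{P}(C) \hookrightarrow \mathcal{P}(D)). \tag{12.85}$$


As in the case $A = \mathbb{C}^n$ just discussed, it follows that we may identify $\underline{L}_0(C)$ with
$\mathcal{P}(C)$ and hence we may and will identify the functor $\underline{L}$ with the functor $\underline{\mathcal{P}}$.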
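Second, whereas in $\mathsf{Sets}$ eq. (12.77) makes $\mathcal{O}(\Sigma)$ a subset of $L$, in the topos
$\mathsf{T}(A)$ the frame $\mathcal{O}(\underline{\Sigma})$ is a subobject $\mathcal{O}(\underline{\Sigma}) \rightarrowtail \underline{\Omega}^{\underline{L}}$. It then follows from (12.11) that
$\mathcal{O}(\underline{\Sigma})(C)$ is a subset of $\mathrm{Sub}(\underline{\mathcal{P}}_{\uparrow C})$, the set of subfunctors of the functor $\underline{\mathcal{P}} : \mathcal{C}(A) \to
\mathsf{Sets}$ restricted to $\uparrow C \subseteq \mathcal{C}(A)$. To see which subset, define


$$\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}) = \{\underline{S} \in \mathrm{Sub}(\underline{\mathcal{P}}_{\uparrow C}) \mid \forall_{D \supseteq C}\, \exists_{x_D \in \mathcal{P}(D)} : \underline{S}(D) = \downarrow x_D\}. \tag{12.86}$$


Thus $\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C})$ consists of subfunctors $\underline{S}$ of $\underline{\mathcal{P}}_{\uparrow C}$ that are locally down-sets. It
then follows from (12.77) and the local interpretation of the relation $\lhd$ in $\mathsf{T}(A)$ (see
Lemma 12.18 in §12.4 below) that the subobject $\mathcal{O}(\underline{\Sigma}) \rightarrowtail \underline{\Omega}^{\underline{L}}$ in $\mathsf{T}(A)$ is the functor


$$\mathcal{O}(\underline{\Sigma})_0(C) = \mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}); \tag{12.87}$$
$$\mathcal{O}(\underline{\Sigma})_1(C \subseteq D) = (\mathcal{O}(\underline{\Sigma})(C) \to \mathcal{O}(\underline{\Sigma})(D)), \tag{12.88}$$


where $\mathcal{O}(\underline{\Sigma})_1$ is inherited from $\underline{\Omega}^{\underline{L}}$ (of which $\mathcal{O}(\underline{\Sigma})$ is a subobject), and hence is just
given by restricting an element of $\mathcal{O}(\underline{\Sigma})(C)$ to $\uparrow D$. Writing


$$\mathrm{Sub}_d(\underline{\mathcal{P}}) = \{\underline{S} \in \mathrm{Sub}(\underline{\mathcal{P}}) \mid \forall_{D \in \mathcal{C}(A)}\, \exists_{x_D \in \mathcal{P}(D)} : \underline{S}(D) = \downarrow x_D\}, \tag{12.89}$$


it is convenient to embed $\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}) \subseteq \mathrm{Sub}_d(\underline{\mathcal{P}})$ by requiring elements of the
left-hand side to vanish whenever $D$ does not contain $C$. We also note that if $\underline{S}$
is to be a subfunctor of $\underline{\mathcal{P}}_{\uparrow C}$, one must have $\underline{S}(D) \subseteq \underline{S}(E)$ whenever $D \subseteq E$, and
that $\downarrow x_D \subseteq \downarrow x_E$ iff $x_D \leq x_E$ in $\mathcal{P}(E)$. Thus one may simply describe elements of
$\mathcal{O}(\underline{\Sigma})(C)$ via maps $S : \mathcal{C}(A) \to \mathcal{P}(A)$ such that:


$$S(D) \in \mathcal{P}(D); \tag{12.90}$$
$$S(D) = 0 \text{ if } D \notin \uparrow C \text{ (i.e. } C \nsubseteq D\text{)}; \tag{12.91}$$
$$S(D) \leq S(E) \text{ if } C \subseteq D \subseteq E. \tag{12.92}$$


The corresponding element $\underline{S}$ of $\mathcal{O}(\underline{\Sigma})(C)$ is then given by
$$\underline{S}(D) = \downarrow S(D), \tag{12.93}$$
seen as a subset of $\mathcal{P}(D)$. Hence it is convenient to introduce the notation
$$\mathcal{O}(\Sigma)_{\uparrow C} = \{S : \uparrow C \to \mathcal{P}(A) \mid S(D) \in \mathcal{P}(D),\ S(D) \leq S(E) \text{ if } D \subseteq E\}, \tag{12.94}$$
of which we single out the case $C = \mathbb{C} \cdot 1_A$, which will be of great importance:
$$\mathcal{O}(\Sigma) = \{S : \mathcal{C}(A) \to \mathcal{P}(A) \mid S(C) \in \mathcal{P}(C),\ S(C) \leq S(D) \text{ if } C \subseteq D\}. \tag{12.95}$$


Both are posets and even frames in the pointwise partial order with respect to the
usual ordering of projections (which algebraically means $e \leq f$ iff $ef = e$), i.e.,


$$S \leq T \Leftrightarrow S(C) \leq T(C) \text{ for all } C \in \mathcal{C}(A). \tag{12.96}$$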
In terms of (12.94) - (12.95), we then have isomorphisms


$$\mathcal{O}(\underline{\Sigma})_0(\mathbb{C} \cdot 1) \cong \mathcal{O}(\Sigma); \tag{12.97}$$
$$\mathcal{O}(\underline{\Sigma})_0(C) \cong \mathcal{O}(\Sigma)_{\uparrow C}. \tag{12.98}$$


More importantly, the frame $\mathcal{O}(\Sigma)$ in $\mathsf{Sets}$ is the key to the external description
of the internal frame $\mathcal{O}(\underline{\Sigma})$ in $\mathsf{T}(A)$; see the end of §E.4. Since $\mathcal{C}(A)$ carries the
Alexandrov topology, by (E.84) this description is given by the frame map


$$\pi_\Sigma^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\Sigma), \tag{12.99}$$
given on the basic opens $\uparrow D \in \mathcal{O}(\mathcal{C}(A))$ by
$$\pi_\Sigma^{-1}(\uparrow D) = \chi_{\uparrow D} : E \mapsto 1 \ (E \supseteq D); \quad E \mapsto 0 \ (E \nsupseteq D). \tag{12.100}$$


As explained before, even in $\mathsf{Sets}$, in principle $\mathcal{O}(\Sigma)$ is just a notation for a frame,
without suggesting that there exists an underlying space $\Sigma$ whose topology it is.
In this case, however, there is such a space (as we shall show in the next section),
and also (12.99) is in fact the inverse image map to a genuine map $\pi_\Sigma : \Sigma \to \mathcal{C}(A)$
between spaces (as opposed to the formal notation used for a locale map).


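We now state the Heyting algebra structure of $\mathcal{O}(\Sigma)$. First, top and bottom are


$$\top(C) = 1 \text{ for all } C; \tag{12.101}$$
$$\bot(C) = 0 \text{ for all } C. \tag{12.102}$$
The logical operations on $\mathcal{O}(\Sigma)$ may be computed from the partial order as
$$(S \wedge T)(C) = S(C) \wedge T(C); \tag{12.103}$$
$$(S \vee T)(C) = S(C) \vee T(C); \tag{12.104}$$
$$(S \to T)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp \vee T(D); \tag{12.105}$$
$$(\neg S)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp; \tag{12.106}$$
$$(\neg\neg S)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C}\ \bigvee^{\mathcal{P}(D)}_{E \supseteq D} S(E), \tag{12.107}$$


where the right-hand side of (12.105) (and similarly (12.106) - (12.107)) is short for


$$\bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp \vee T(D) = \bigvee \{e \in \mathcal{P}(C) \mid e \leq S(D)^\perp \vee T(D)\ \forall D \supseteq C\}. \tag{12.108}$$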
Recall that a Heyting algebra is Boolean iff $\neg\neg S = S$ for each $S$. One sees from
(12.107) that (at least if $n > 1$) the property $\neg\neg S = S$ only holds iff $S$ is either $\top$ or
$\bot$, so that the Heyting algebra $\mathcal{O}(\Sigma) \equiv \mathcal{O}(\Sigma(A))$ is properly intuitionistic.
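The failure of double negation can be seen explicitly in a toy model that is much smaller than $M_n(\mathbb{C})$. The Python sketch below is an illustration under assumptions not made in the text: it takes the commutative algebra $A = \mathbb{C}^2$, whose poset $\mathcal{C}(A)$ has only the two contexts $\mathbb{C} \cdot 1 \subset \mathbb{C}^2$, represents diagonal projections as subsets of $\{0, 1\}$, and implements the negation formula (12.106) via the reading (12.108). The claim in the text about $S = \top, \bot$ concerns $M_n(\mathbb{C})$ and is not what this toy case tests.

```python
from itertools import product

EMPTY, FULL = frozenset(), frozenset({0, 1})

# Projections available in each context: "C1" = C·1 and "C2" = C^2.
PROJ = {"C1": [EMPTY, FULL],
        "C2": [EMPTY, frozenset({0}), frozenset({1}), FULL]}
UP = {"C1": ["C1", "C2"], "C2": ["C2"]}   # contexts D containing C

def elements():
    """All S with S(C) in P(C) and S(C1) <= S(C2), cf. (12.95)."""
    return [{"C1": s1, "C2": s2}
            for s1, s2 in product(PROJ["C1"], PROJ["C2"]) if s1 <= s2]

def meet_in(C, bounds):
    """Largest projection in P(C) below every bound, cf. (12.108)."""
    below = [e for e in PROJ[C] if all(e <= b for b in bounds)]
    return frozenset().union(*below)

def neg(S):
    """Heyting negation (12.106): meet in P(C) of the complements S(D)^perp."""
    return {C: meet_in(C, [FULL - S[D] for D in UP[C]]) for C in PROJ}

S = {"C1": EMPTY, "C2": FULL}
double_negation_fixed = [T for T in elements() if neg(neg(T)) == T]
```

In this model the proposition $S$ that is false in the trivial context but true in the larger one satisfies $\neg\neg S = \top \neq S$, so even this five-element frame is not Boolean.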

Since from both a physical and a logical point of view the Heyting algebra
$\mathcal{O}(\Sigma(A))$ has vast advantages over the projection lattice $\mathcal{P}(A)$ of Birkhoff and von
Neumann, we propose it as a candidate for a new quantum logic. Let us explain why.

Physically, in von Neumann's approach each projection $e \in \mathcal{P}(A)$ defines an
elementary proposition, whereas in Bohr's (where the classical context $C$ is crucial)
an elementary proposition is a pair $(C,e)$, where $e \in \mathcal{P}(C)$ is a proposition à la von
Neumann (who lost sight of the context $C$). If for each such pair $(C,e)$ we define


$$S_{(C,e)} : \mathcal{C}(A) \to \mathcal{P}(A); \tag{12.109}$$
$$D \mapsto e \quad (C \subseteq D); \tag{12.110}$$
$$D \mapsto \bot \quad \text{otherwise}, \tag{12.111}$$


we see that each pair $(C,e)$ injectively defines an element of $\mathcal{O}(\Sigma)$. Furthermore,
each element $S$ of $\mathcal{O}(\Sigma)$ is a disjunction over such elementary propositions, since


$$S = \bigvee_{C \in \mathcal{C}(A)} S_{(C,S(C))}. \tag{12.112}$$


In contrast to traditional quantum logic, both logical connectives $\wedge$ and $\vee$ on $\mathcal{O}(\Sigma)$
are physically meaningful, as they only involve local conjunctions $S(C) \wedge T(C)$ and
disjunctions $S(C) \vee T(C)$, for which $S(C) \in \mathcal{P}(C)$ and $T(C) \in \mathcal{P}(C)$ commute.

Logically, the absence of an implication arrow in quantum logic has always been
worrying; this has now been put straight in $\mathcal{O}(\Sigma)$, where $\to$ belongs to the defining
structure and behaves well logically. Truth attribution in quantum logic is equally
suspicious: for any state $\omega$ on $A$ one declares a proposition $e \in \mathcal{P}(A)$ true iff $\omega(e) =
1$, and false iff $\omega(e) = 0$, with no verdict otherwise (except probabilistically).

We, however, define a natural Kripke semantics (cf. §D.3) on $P = \mathcal{C}(A)$ by


$$V_\omega : \mathcal{O}(\Sigma) \to \mathrm{Upper}(\mathcal{C}(A)) \equiv \mathcal{O}(\mathcal{C}(A)); \tag{12.113}$$
$$V_\omega(S) = \{C \in \mathcal{C}(A) \mid \omega(S(C)) = 1\}, \tag{12.114}$$


where $\mathcal{C}(A)$ carries the Alexandrov topology as usual. Note that $V_\omega(S)$ indeed defines an upper set in $\mathcal{C}(A)$, for if $C \subseteq D$ then $S(C) \leq S(D)$, so that $\omega(S(C)) \leq \omega(S(D))$ by positivity of states, and hence $\omega(S(D)) = 1$ whenever $\omega(S(C)) = 1$ (given that $\omega(S(D)) \leq 1$, which is true since $0 \leq \omega(e) \leq 1$ for any projection e). As explained in §D.3, a proposition $S \in \mathcal{O}(\Sigma)$ is true in a state $\omega$ if $V_\omega(S) = \mathcal{C}(A)$, i.e. the top element of the frame $\mathcal{O}(\mathcal{C}(A))$; we also declare it false if $V_\omega(S) = \emptyset$, i.e. the bottom element of $\mathcal{O}(\mathcal{C}(A))$. Then $\neg S$ is true iff S is false, and $S \vee T$ is true iff either S or T is true (since $V_\omega(S) = \mathcal{C}(A)$ iff $S(\mathbb{C} \cdot 1) = 1$, which forces $S(C) = 1$ for all C). Consequently, (12.114) simply lists the contexts C in which S(C) is true.
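
The valuation (12.114) can be made concrete in a toy model. The following Python sketch is only an illustration under simplifying assumptions that are not in the text: it takes the commutative algebra $A = \mathbb{C}^3$ (rather than $M_n(\mathbb{C})$), encodes each context as a partition of {0,1,2}, each projection in a context as a union of blocks, and a state as a probability vector. All function names are hypothetical.

```python
from fractions import Fraction

def partition(*blocks):
    """A context of C^3, encoded as a partition of {0,1,2} (assumed toy model)."""
    return frozenset(frozenset(b) for b in blocks)

# The five unital subalgebras of C^3, from C.1 (one block) to C^3 itself.
CONTEXTS = [
    partition((0, 1, 2)),
    partition((0,), (1, 2)),
    partition((1,), (0, 2)),
    partition((2,), (0, 1)),
    partition((0,), (1,), (2,)),
]

def contains(D, C):
    """C is a subalgebra of D iff every block of D lies inside a block of C."""
    return all(any(d <= c for c in C) for d in D)

def elementary(C, e):
    """S_(C,e) as in (12.109)-(12.111): D -> e if C <= D, else the zero projection."""
    return {D: (e if contains(D, C) else frozenset()) for D in CONTEXTS}

def valuation(S, omega):
    """V_omega(S) as in (12.114): the contexts C with omega(S(C)) = 1."""
    return {C for C, e in S.items() if sum(omega[i] for i in e) == 1}

# Example: the proposition "outcome 0" asked in the context {0},{1,2}.
C = partition((0,), (1, 2))
S = elementary(C, frozenset({0}))
omega = {0: Fraction(1), 1: Fraction(0), 2: Fraction(0)}
V = valuation(S, omega)
```

In this example V consists of C and the finest context only, so it is an upper set but not all of the poset: the proposition is neither true nor false in the sense described above.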
12.4 Internal Gelfand spectrum for arbitrary C*-algebras


In this section we compute the internal Gelfand spectrum $\underline{\Sigma}(\underline{A}) \equiv \underline{\Sigma}$ in T(A) for an arbitrary unital C*-algebra A. Recall Definition D.6 (in §D.1) of a free lattice $\mathcal{L}_S$ on a set S, and its refinement in quotienting by a congruence on $\mathcal{L}_S$ explained after that definition. According to Definition E.21, lattices can be defined in any topos. The following “locality lemma” shows that the construction of a free lattice on some object makes sense in functor toposes, and so does its refinement just mentioned, at least as long as the congruence in question is defined through equalities.


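Lemma 12.15. Let T = [C, Sets] be any functor topos (where C is some category).
1. There exists a free distributive lattice $\underline{\mathcal{L}}_{\underline{S}} \in T$ on any object $\underline{S} \in T$, which can be computed locally: the object part of $\underline{\mathcal{L}}_{\underline{S}}$ is given by

$$(\underline{\mathcal{L}}_{\underline{S}})_0(C) = \mathcal{L}_{\underline{S}_0(C)}, \tag{12.115}$$

where $\mathcal{L}_{\underline{S}_0(C)}$ is defined in Sets, and the arrow part is defined as follows. If $f : C \to D$, then $(\underline{\mathcal{L}}_{\underline{S}})_1(f)$ is the unique arrow making the following diagram commute:

$$\begin{array}{ccc}
\mathcal{L}_{\underline{S}_0(C)} & \xrightarrow{\ (\underline{\mathcal{L}}_{\underline{S}})_1(f)\ } & \mathcal{L}_{\underline{S}_0(D)} \\
\uparrow{\scriptstyle \iota} & & \uparrow{\scriptstyle \iota} \\
\underline{S}_0(C) & \xrightarrow{\ \underline{S}_1(f)\ } & \underline{S}_0(D)
\end{array} \tag{12.116}$$

2. The same is true if $\underline{\mathcal{L}}_{\underline{S}}$ is subject to relations defined by equalities among elements of $\underline{\mathcal{L}}_{\underline{S}}$ (as long as these equalities generate a congruence).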


Proof. The proof is an elaborate verification, which may be summarized as follows. 


1. Existence and uniqueness of the arrow $(\underline{\mathcal{L}}_{\underline{S}})_1(f)$ in (12.116) follows from the universal property of the free distributive lattice $\mathcal{L}_{\underline{S}_0(C)}$ in Sets; just consider the function $\iota \circ \underline{S}_1(f) : \underline{S}_0(C) \to \mathcal{L}_{\underline{S}_0(D)}$. The claim follows from the fact that $\underline{\mathcal{L}}_{\underline{S}}$ (defined locally) has the required universal property (as can be established locally, from the corresponding property of each $(\underline{\mathcal{L}}_{\underline{S}})_0(C)$) and hence is unique.

2. This is proved in a similar way, since also a free distributive lattice $\mathcal{L}_S/\!\sim$ on generators S with relations given by equalities has a universal property, cf. the final part of §D.1. This works locally in a functor topos by rule no. 7 of Kripke-Joyal semantics, cf. §E.5 (which states that equalities are enforced locally).


We will apply this lemma to T = T(A), as in (12.1), with $C = \mathcal{C}(A)$. This hinges on a lemma of independent interest, which we first state for Sets, i.e., for “ordinary” commutative unital C*-algebras A, to be subsequently internalized to our topos T(A).


Lemma 12.16. The lattice $L_A$ in (12.44) is (constructively) isomorphic to the lattice $L'_A$ freely generated by the symbols $D_a$, $a \in A_{sa}$, and the relations (12.45) - (12.50).

Proof. The point is that the map $a \mapsto D_a$ from $A_{sa}$ to $L'_A$ is surjective; this follows from the relations (12.45) - (12.50) through their consequences (12.51) - (12.55). The pertinent isomorphism $L'_A \cong L_A$ is then given by mapping $D_a \mapsto [a^+]$ on generators (note that in the original discussion of $L_A$ following (12.44) this map was the definition of $D_a$; this time, these play an independent role as generators of the lattice $L'_A$, and in the present proof they are related to the elements $[a^+] \in L_A$).


Now let A be a (not necessarily commutative) unital C*-algebra (in Sets), with ensuing internal commutative C*-algebra $\underline{A}$ in the functor topos T(A), cf. Theorem 12.3. Our goal is to apply the constructive definition of the Gelfand spectrum $\Sigma(A)$, or rather of its topology $\mathcal{O}(\Sigma(A))$ (seen as a frame, so that $\Sigma(A)$ is seen as a locale) in §12.2 to $\underline{A}$. The first step concerns the lattice $L_A$, which in T(A) is denoted by $\underline{L}_{\underline{A}}$. Here and in what follows, we try to avoid notational confusion by writing $\underline{D}_{\underline{a}}$ for the formal variable indexed by $\underline{a}$ (which is a variable of type $\underline{A}$ in T(A)), whilst writing $D_c$ for the actual element $[c^+]$ of $L_C$ if we apply (12.44) etc. to $C \in \mathcal{C}(A)$.


Proposition 12.17. For each $C \in \mathcal{C}(A)$ one has

$$\underline{L}_{\underline{A}}(C) = L_C, \tag{12.117}$$

where $L_C$ is defined in Sets through (12.44) (with $A \rightsquigarrow C$), where it may be computed through Lemma 12.16. Furthermore, if $C \subseteq D$, then the map $\underline{L}_{\underline{A}}(C) \to \underline{L}_{\underline{A}}(D)$ given by the functoriality of $\underline{L}_{\underline{A}}$, i.e., $L_C \to L_D$, maps each generator $D_c$ in $L_C$ (where $c \in C_{sa}$) to the same generator in $L_D$. This is well defined, because $c \in D_{sa}$, and this inclusion preserves the relations (12.45) - (12.50). We write this as $L_C \hookrightarrow L_D$.


Proof. Internalizing Lemma 12.16 to our functor topos T(A), it follows that the internal lattice $\underline{L}_{\underline{A}}$ in T(A) is isomorphic to a distributive lattice freely generated by generators and relations given by equalities. Hence Lemma 12.15 applies to it.


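The next step is to move from $\underline{L}_{\underline{A}}$ to the corresponding frame of regular ideals, cf. Theorem 12.8. Abbreviating $\mathcal{O}(\underline{\Sigma}(\underline{A})) \equiv \mathcal{O}(\underline{\Sigma})$, we first rewrite (12.60) as

$$\mathcal{O}(\underline{\Sigma}) = \{U \in \mathrm{Idl}(\underline{L}_{\underline{A}}) \mid \forall_{q>0}\, D_{a-q} \in U \Rightarrow D_a \in U\}. \tag{12.118}$$

To apply this to our functor topos T(A), we apply Kripke-Joyal semantics for the internal language of the topos T(A) (which is reviewed in §E.5) to the formula $D_a \lhd U$. This is a formula $\varphi$ with two free variables, namely $D_a$ of type $\underline{L}_{\underline{A}}$, and U of type

$$\mathcal{P}(\underline{L}_{\underline{A}}) \equiv \Omega^{\underline{L}_{\underline{A}}}. \tag{12.119}$$

Hence in the forcing statement $C \Vdash \varphi(\alpha)$ in T(A), we have to insert

$$\alpha \in (\underline{L}_{\underline{A}} \times \Omega^{\underline{L}_{\underline{A}}})(C) \cong L_C \times \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C}),$$

where $\underline{L}_{\underline{A}|{\uparrow}C}$ is the restriction of the functor

$$\underline{L}_{\underline{A}} : \mathcal{C}(A) \to \mathrm{Sets} \tag{12.120}$$

to ${\uparrow}C \subset \mathcal{C}(A)$. Here we have used (12.117), as well as the isomorphism (12.11). Consequently, we have

$$\alpha = (D_c, U), \tag{12.121}$$

where $D_c \in L_C$ for some $c \in C_{sa}$, and $U : {\uparrow}C \to \mathrm{Sets}$ is a subfunctor of $\underline{L}_{\underline{A}|{\uparrow}C}$. In particular, $U(D) \subseteq L_D$ is defined whenever $D \supseteq C$, and the subfunctor condition on U simply boils down to $U(D) \subseteq U(E)$ whenever $C \subseteq D \subseteq E$.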


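Lemma 12.18. In the topos T(A), the cover $\lhd$ of Lemma 12.13 may be computed locally, in the sense that for any $C \in \mathcal{C}(A)$, $D_c \in L_C$ and $U \in \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C})$, one has

$$C \Vdash (\underline{D}_{\underline{a}} \lhd \underline{U})(D_c, U) \quad \text{iff} \quad D_c \lhd_C U(C),$$

in that for all $q > 0$ there exists a finite $U_0 \subseteq U(C)$ such that $D_{c-q} \leq \bigvee U_0$.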


Proof. We assume that $\bigvee U_0 \in U$, so that we may replace $U_0$ by $D_b = \bigvee U_0$; the general case is analogous. We then have to inductively analyze the formula $D_a \lhd U$, which, under the stated assumption, in view of Lemma 12.13 may be taken to mean

$$\forall_{q>0}\, \exists_{D_b \in L_A}\, (D_b \in U \wedge D_{a-q} \leq D_b). \tag{12.122}$$

We now infer from the rules for Kripke-Joyal semantics in a functor topos that

$$C \Vdash (D_a \in U)(D_c, U) \tag{12.123}$$

iff for all $D \supseteq C$ one has $D_c \in U(D)$; since $U(C) \subseteq U(D)$, this happens to be the case iff $D_c \in U(C)$. Furthermore,

$$C \Vdash (D_b \leq D_a)(D_{c'}, D_c) \tag{12.124}$$

iff $D_{c'} \leq D_c$ in $L_C$. Also,

$$C \Vdash (\exists_{D_b \in L_A}\, D_b \in U \wedge D_{a-q} \leq D_b)(D_c, U) \tag{12.125}$$

iff there is $D_{c'} \in U(C)$ such that $D_{c-q} \leq D_{c'}$. Finally,

$$C \Vdash (\forall_{q>0}\, \exists_{D_b \in L_A}\, D_b \in U \wedge D_{a-q} \leq D_b)(D_c, U) \tag{12.126}$$

iff for all $D \supseteq C$ and all $q > 0$ there is $D_d \in U(D)$ such that $D_{c-q} \leq D_d$, where $D_c \in L_C$ is seen as an element of $L_D$ through the injection $L_C \hookrightarrow L_D$ of Proposition 12.17, and $U \in \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C})$ is seen as an element of $\mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}D})$ by restriction. This, however, is true at all $D \supseteq C$ iff it is true at C, because $U(C) \subseteq U(D)$ and hence one can take $D_d = D_{c'}$ for the $D_{c'} \in L_C$ that makes the condition true at C.


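Lemma 12.19. The spectrum $\mathcal{O}(\underline{\Sigma})$ of $\underline{A}$ in T(A) may be computed as follows:


1. At $C \in \mathcal{C}(A)$, the set $\mathcal{O}(\underline{\Sigma})(C)$ consists of those subfunctors $U \in \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C})$ such that for all $D \supseteq C$ and all $D_d \in L_D$ one has:

$$D_d \lhd_D U(D) \Rightarrow D_d \in U(D).$$

2. At $\mathbb{C} \cdot 1$, the set $\mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1)$ consists of those subfunctors $U \in \mathrm{Sub}(\underline{L}_{\underline{A}})$ such that for all $C \in \mathcal{C}(A)$ and all $D_c \in L_C$ one has:

$$D_c \lhd_C U(C) \Rightarrow D_c \in U(C).$$

3. The condition that $U = \{U(C) \subseteq L_C\}_{C \in \mathcal{C}(A)}$ be a subfunctor of $\underline{L}_{\underline{A}}$ comes down to the requirement that:

$$C \subseteq D \Rightarrow U(C) \subseteq U(D).$$

4. The map $\mathcal{O}(\underline{\Sigma})(C) \to \mathcal{O}(\underline{\Sigma})(D)$ given by the functoriality of $\mathcal{O}(\underline{\Sigma})$ whenever $C \subseteq D$ is given by truncating an element $U : {\uparrow}C \to \mathrm{Sets}$ of $\mathcal{O}(\underline{\Sigma})(C)$ to ${\uparrow}D$.
5. The external description of $\mathcal{O}(\underline{\Sigma})$ is the frame map

$$\pi_\Sigma^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1), \tag{12.127}$$

given on the basic opens ${\uparrow}D \in \mathcal{O}(\mathcal{C}(A))$ by

$$\pi_\Sigma^{-1}({\uparrow}D) = \chi_{\uparrow D} : \quad E \mapsto \top \ (E \supseteq D); \qquad E \mapsto \bot \ (E \nsupseteq D), \tag{12.128}$$

where the top and bottom $\top, \bot$ at E are given by $\{L_E\}$ and $\emptyset$, respectively.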
Proof. By (12.75), $\mathcal{O}(\underline{\Sigma})$ is the subobject of $\Omega^{\underline{L}_{\underline{A}}}$ defined by the formula $\varphi$ given by

$$\forall_{D_a \in L_A}\, D_a \lhd U \Rightarrow D_a \in U, \tag{12.129}$$

whose interpretation in T(A) is an arrow from $\Omega^{\underline{L}_{\underline{A}}}$ to $\Omega$. In view of (12.11), we may identify an element $U \in \mathcal{O}(\underline{\Sigma})(C)$ with a subfunctor of $\underline{L}_{\underline{A}|{\uparrow}C}$, and by (12.129) and Kripke-Joyal semantics in functor topoi, we have $U \in \mathcal{O}(\underline{\Sigma})(C)$ iff $C \Vdash \varphi(U)$, with $\varphi$ given by (12.129). Unfolding this using Kripke-Joyal semantics and using Lemma 12.18 (including part 1 of its proof), we find that $U \in \mathcal{O}(\underline{\Sigma})(C)$ iff

$$\forall_{D \supseteq C}\, \forall_{D_d \in L_D}\, \forall_{E \supseteq D}\ D_d \lhd_E U(E) \Rightarrow D_d \in U(E), \tag{12.130}$$

where $D_d$ is regarded as an element of $L_E$. This condition, however, is equivalent to the apparently weaker condition

$$\forall_{D \supseteq C}\, \forall_{D_d \in L_D}\ D_d \lhd_D U(D) \Rightarrow D_d \in U(D); \tag{12.131}$$

indeed, condition (12.130) clearly implies (12.131), but the latter applied at D = E actually implies the first, since $D_d \in L_D$ also lies in $L_E$.

Clauses 2 to 4 should now be obvious. Clause 5 follows by the explicit prescription for the external description of frames (which has been recalled in the previous section, after its initial description at the end of §E.4). Note that each $\mathcal{O}(\underline{\Sigma})(C)$ is a frame in Sets, inheriting the frame structure of the ambient frame $\mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C})$.




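We now present the computation of $\mathcal{O}(\underline{\Sigma}) \equiv \mathcal{O}(\underline{\Sigma}(\underline{A}))$ for general unital C*-algebras A. To explain the final formula, topologize the disjoint union

$$\Sigma^A = \bigsqcup_{C \in \mathcal{C}(A)} \Sigma(C), \tag{12.132}$$

where $\Sigma(C)$ is the Gelfand spectrum of $C \in \mathcal{C}(A)$, as follows, abbreviating

$$\mathcal{U}_C = \mathcal{U} \cap \Sigma(C). \tag{12.133}$$

One has $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ iff the following two conditions are satisfied for all $C \in \mathcal{C}(A)$:


1. $\mathcal{U}_C \in \mathcal{O}(\Sigma(C))$.
2. For all $D \supseteq C$, if $\lambda \in \mathcal{U}_C$ and $\lambda' \in \Sigma(D)$ such that $\lambda'_{|C} = \lambda$, then $\lambda' \in \mathcal{U}_D$.


In fact, $\mathcal{O}(\Sigma^A)$ is simply the weakest topology making the canonical projection

$$\pi : \Sigma^A \to \mathcal{C}(A); \tag{12.134}$$
$$\pi(\sigma) = C \quad (\sigma \in \Sigma(C) \subset \Sigma^A), \tag{12.135}$$

continuous with respect to the Alexandrov topology on $\mathcal{C}(A)$. For $U \in \mathcal{O}(\mathcal{C}(A))$,

$$\Sigma^A_U = \bigsqcup_{C \in U} \Sigma(C) \tag{12.136}$$

is a subset of $\Sigma^A$, with relative topology inherited from $\Sigma^A$. In particular, for the basic opens $U = {\uparrow}C$ of the Alexandrov topology on $\mathcal{C}(A)$ we have

$$\Sigma^A_{\uparrow C} = \bigsqcup_{D \supseteq C} \Sigma(D). \tag{12.137}$$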
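
The two openness conditions above can be checked mechanically in a finite toy case. The sketch below is an illustration only, under assumptions that are not in the text: it uses the commutative algebra $A = \mathbb{C}^3$, encodes a context as a partition of {0,1,2}, and a point of $\Sigma(C)$ as a block of that partition (each spectrum is finite and discrete, so condition 1 is automatic). All names are hypothetical.

```python
def partition(*blocks):
    """A context of C^3, encoded as a partition of {0,1,2} (assumed toy model)."""
    return frozenset(frozenset(b) for b in blocks)

CONTEXTS = [
    partition((0, 1, 2)),            # the trivial context C.1
    partition((0,), (1, 2)),
    partition((1,), (0, 2)),
    partition((2,), (0, 1)),
    partition((0,), (1,), (2,)),     # the maximal context
]

def contains(D, C):
    """C is a subalgebra of D iff every block of D lies inside a block of C."""
    return all(any(d <= c for c in C) for d in D)

def points():
    """The disjoint union (12.132): pairs (context, character)."""
    return {(C, block) for C in CONTEXTS for block in C}

def extensions(point):
    """All (D, d) with D above C whose restriction to C is the given character."""
    C, c = point
    return {(D, d) for D in CONTEXTS if contains(D, C) for d in D if d <= c}

def is_open(U):
    """Condition 2: an open set contains every extension of each of its points."""
    return all(extensions(p) <= U for p in U)

def open_hull(seed):
    """Smallest open set containing the given points."""
    return set().union(*(extensions(p) for p in seed))

bottom = (CONTEXTS[0], frozenset({0, 1, 2}))   # the single point over C.1
```

Here `open_hull({bottom})` returns every point of the disjoint union, whereas a point over the maximal context is open on its own; the topology is therefore far from Hausdorff.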


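Theorem 12.20. Let A be a unital C*-algebra. The internal Gelfand spectrum $\mathcal{O}(\underline{\Sigma}(\underline{A}))$ of our internal commutative C*-algebra $\underline{A}$ in the topos T(A) is the functor

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0 : C \mapsto \mathcal{O}(\Sigma^A_{\uparrow C}), \tag{12.138}$$

i.e., the frame (in Sets) of the open sets of $\Sigma^A_{\uparrow C}$ in the topology defined after (12.132); if $C \subseteq D$, the arrow-part of the functor in question is given by

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 : \mathcal{O}(\Sigma^A_{\uparrow C}) \to \mathcal{O}(\Sigma^A_{\uparrow D}); \tag{12.139}$$
$$\mathcal{U} \mapsto \mathcal{U} \cap \Sigma^A_{\uparrow D}. \tag{12.140}$$

Similarly, in the description of T(A) as the category of sheaves $\mathrm{Sh}(\mathcal{C}(A))$, cf. (E.84), the Gelfand spectrum is given by the sheaf (where $U \subseteq V$ in (12.142)):

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0 : U \mapsto \mathcal{O}(\Sigma^A_U) \quad (U \in \mathcal{O}(\mathcal{C}(A))); \tag{12.141}$$
$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 : \mathcal{W} \mapsto \mathcal{W} \cap \Sigma^A_U \quad (\mathcal{W} \in \mathcal{O}(\Sigma^A_V)). \tag{12.142}$$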
Proof. The proof is based on Lemma 12.19, which implies that the internal frame $\mathrm{RIdl}(\underline{L}_{\underline{A}})$ in T(A) is given by the functor

$$\mathrm{RIdl}(\underline{L}_{\underline{A}}) : C \mapsto \{F \in \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C}) \mid F(D) \in \mathrm{RIdl}(L_D) \text{ for all } D \supseteq C\}. \tag{12.143}$$

Here, since D is a commutative unital C*-algebra in Sets, according to (12.60) the set $\mathrm{RIdl}(L_D)$ may be identified with the topology $\mathcal{O}(\Sigma(D))$, where $\Sigma(D)$ is the Gelfand spectrum of D in the usual sense. We will make this identification in the following step, which is the last step of the proof of Theorem 12.20.


Lemma 12.21. The transformation $\theta : \mathrm{RIdl}(\underline{L}_{\underline{A}}) \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$ with components

$$\theta_C : \{F \in \mathrm{Sub}(\underline{L}_{\underline{A}|{\uparrow}C}) \mid F(D) \in \mathcal{O}(\Sigma(D)) \text{ for all } D \supseteq C\} \to \mathcal{O}(\Sigma^A_{\uparrow C});$$
$$F \mapsto \bigsqcup_{D \supseteq C} F(D), \tag{12.144}$$

is a natural isomorphism of functors, i.e., an isomorphism of objects in T(A).


Since $\mathrm{RIdl}(\underline{L}_{\underline{A}})$ and $\mathcal{O}(\underline{\Sigma})$ are internal frames in T(A), it suffices to prove that each $\theta_C$ is an isomorphism of frames in Sets. Unfortunately, even this proof is a very lengthy (though straightforward) affair, for which we refer to the literature.


Corollary 12.22. The external description (in Sets) of the internal locale $\underline{\Sigma}(\underline{A})$ (in T(A)) is given by the canonical projection (12.134).


Note that both $\Sigma^A$ and $\mathcal{C}(A)$ are topological spaces, so that (12.134) is a bona fide continuous map between spaces. This is worth stressing, since in general, an external description of an internal locale in a sheaf topos, though defined in Sets, is a map between locales (or, equivalently, between frames) that are not necessarily topological spaces. But in the case (12.134) at hand they are, so at least this time there is no confusion between $\mathcal{O}(X)$ as both formal notation for a frame (not necessarily coming from a topology) and notation for the topology of a space X; see §C.11.
Note that (12.95) is a special case of Theorem 12.20 or Corollary 12.22, for

$$A = M_n(\mathbb{C}). \tag{12.145}$$

To see this, we identify $\mathcal{U} = \bigsqcup_{C \in \mathcal{C}(A)} \mathcal{U}_C$ as an element of $\mathcal{O}(\Sigma^A)$ with

$$S : \mathcal{C}(A) \to \mathcal{P}(A)$$

on the right-hand side of (12.95), where $S(C) \in \mathcal{P}(C)$ is the image of $\mathcal{U}_C \in \mathcal{O}(\Sigma(C))$ under the isomorphism $\mathcal{O}(\Sigma(C)) \to \mathcal{P}(C)$ between the (discrete) topology of the (finite) Gelfand spectrum of C and the (Boolean) projection lattice of C derived earlier, see (12.80). Similarly, for $U \in \mathcal{O}(\mathcal{C}(A))$, the frame $\mathcal{O}(\Sigma^A_U)$ may be identified with maps

$$S : U \to \mathcal{P}(A)$$

satisfying the conditions in (12.95). Of course, the special case (12.145) leading to (12.95) is very appealing, and was well worth treating in its own right!
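Theorem 12.20 and Corollary 12.22 also give an explicit description of the general internal Gelfand isomorphism (12.78), whose real part in T(A) reads

$$\underline{A}_{sa} \cong C(\underline{\Sigma}, \mathbb{R}) \equiv \mathrm{Frm}(\mathcal{O}(\mathbb{R}), \mathcal{O}(\underline{\Sigma})), \tag{12.146}$$

where the right-hand side, which denotes the object of frame homomorphisms from $\mathcal{O}(\mathbb{R})$ to $\mathcal{O}(\underline{\Sigma})$ within T(A), is the definition of the middle term (which is just a notation). To understand the situation in T(A), one has to distinguish between:


1. The object $\mathrm{Frm}(\mathcal{O}(\mathbb{R}), \mathcal{O}(\underline{\Sigma}))$ in T(A), defined as the subobject of the exponential $\mathcal{O}(\underline{\Sigma})^{\mathcal{O}(\mathbb{R})}$ consisting of (internal) frame maps from $\mathcal{O}(\mathbb{R})$ to $\mathcal{O}(\underline{\Sigma})$.

2. The set $\mathrm{Hom}_{\mathrm{Frm}}(\mathcal{O}(\mathbb{R}), \mathcal{O}(\underline{\Sigma}))$ of internal frame maps from the frame $\mathcal{O}(\mathbb{R})$ of (Dedekind) real numbers in T(A) to the frame $\mathcal{O}(\underline{\Sigma})$ (i.e., the set of those arrows from $\mathcal{O}(\mathbb{R})$ to $\mathcal{O}(\underline{\Sigma})$ that happen to be frame maps as seen from within T(A)).


The connection between 1. and 2. is given by $\lambda$-conversion, i.e., the bijective correspondence between $C \to B^A$ and $A \times C \to B$, cf. (E.153). Taking C = 1 (i.e. the terminal object in T(A)), we see that an element of the set Hom(A,B) corresponds to an arrow $1 \to B^A$. Eq. (12.8) yields

$$\mathrm{Frm}(\mathcal{O}(\mathbb{R}), \mathcal{O}(\underline{\Sigma}))(C) = \mathrm{Nat}_{\mathrm{Frm}}(\mathcal{O}(\mathbb{R})_{|{\uparrow}C}, \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}); \tag{12.147}$$

the set of all natural transformations between the functors $\mathcal{O}(\mathbb{R})$ and $\mathcal{O}(\underline{\Sigma})$, both restricted to ${\uparrow}C \subset \mathcal{C}(A)$, that are frame maps. This set can be computed from the external description of frames and frame maps in §E.4. Recall (12.4) etc. The frame $\mathcal{O}(\mathbb{R})_{|{\uparrow}C}$ has external description

$$\pi_{\mathbb{R}}^{-1} : \mathcal{O}({\uparrow}C) \to \mathcal{O}({\uparrow}C \times \mathbb{R}), \tag{12.148}$$

where $\pi_{\mathbb{R}} : {\uparrow}C \times \mathbb{R} \to {\uparrow}C$ is projection on the first component. The special case $C = \mathbb{C} \cdot 1$ yields the external description of $\mathcal{O}(\mathbb{R})$ itself, namely

$$\pi_{\mathbb{R}}^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\mathcal{C}(A) \times \mathbb{R}), \tag{12.149}$$

where this time (with abuse of notation) the projection is $\pi_{\mathbb{R}} : \mathcal{C}(A) \times \mathbb{R} \to \mathcal{C}(A)$. By Corollary 12.22, the Gelfand frame $\mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}$ has external description

$$\pi_\Sigma^{-1} : \mathcal{O}({\uparrow}C) \to \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}, \tag{12.150}$$

given by (12.128), with the understanding that $D \supseteq C$ (the special case $C = \mathbb{C} \cdot 1$ then recovers the external description (12.99) of $\mathcal{O}(\underline{\Sigma})$ itself). It follows that there is a bijective correspondence between two classes of frame maps:

$$\underline{\varphi}_C^{-1} : \mathcal{O}(\mathbb{R})_{|{\uparrow}C} \to \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C} \quad (\text{in } T(A)); \tag{12.151}$$
$$\varphi_C^{-1} : \mathcal{O}({\uparrow}C \times \mathbb{R}) \to \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C} \quad (\text{in Sets}), \tag{12.152}$$

where $\varphi_C^{-1}$ must satisfy the condition that for any $D \supseteq C$,

$$\varphi_C^{-1}({\uparrow}D \times \mathbb{R}) = \chi_{\uparrow D}. \tag{12.153}$$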


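Indeed, such a map $\varphi_C^{-1}$ defines an element $\underline{\varphi}_C^{-1}$ of $\mathrm{Nat}(\mathcal{O}(\mathbb{R})_{|{\uparrow}C}, \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C})$ in the obvious way: for $D \in {\uparrow}C$, the components

$$\underline{\varphi}_C^{-1}(D) : \mathcal{O}(\mathbb{R})(D) \to \mathcal{O}(\underline{\Sigma})(D) \tag{12.154}$$

of the natural transformation $\underline{\varphi}_C^{-1}$, i.e.

$$\underline{\varphi}_C^{-1}(D) : \mathcal{O}({\uparrow}D \times \mathbb{R}) \to \mathcal{O}(\underline{\Sigma})_{|{\uparrow}D}, \tag{12.155}$$

are simply given by the restriction of $\varphi_C^{-1}$ to $\mathcal{O}({\uparrow}D \times \mathbb{R}) \subset \mathcal{O}({\uparrow}C \times \mathbb{R})$; cf. (E.147). This is consistent, because (12.153) implies that for any $U \in \mathcal{O}(\mathbb{R})$ and $C \subseteq D \subseteq E$,

$$\varphi_C^{-1}({\uparrow}E \times U)(F) \leq \varphi_C^{-1}({\uparrow}D \times \mathbb{R})(F), \tag{12.156}$$

which by (12.153) vanishes whenever $F \nsupseteq D$. Consequently,

$$\varphi_C^{-1}({\uparrow}E \times U)(F) = \emptyset \quad \text{if } F \nsupseteq D, \tag{12.157}$$

so that $\underline{\varphi}_C^{-1}(D)$ actually takes values in $\mathcal{O}(\underline{\Sigma})_{|{\uparrow}D}$ (rather than in $\mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}$, as might be expected). Denoting the set of frame maps (12.152) that satisfy (12.153) by $\mathrm{Frm}'(\mathcal{O}({\uparrow}C \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C})$, we obtain a functor

$$\mathrm{Frm}'(\mathcal{O}({\uparrow}(-) \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}(-)}) : \mathcal{C}(A) \to \mathrm{Sets}, \tag{12.158}$$

with the stipulation that for $C \subseteq D$ the induced map

$$\mathrm{Frm}'(\mathcal{O}({\uparrow}C \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}) \to \mathrm{Frm}'(\mathcal{O}({\uparrow}D \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}D})$$

is given by restricting an element of the left-hand side to $\mathcal{O}({\uparrow}D \times \mathbb{R}) \subset \mathcal{O}({\uparrow}C \times \mathbb{R})$; this is consistent by the same argument (12.157).
The Gelfand isomorphism (12.78) is therefore a natural transformation

$$\underline{A} \xrightarrow{\ \cong\ } \mathrm{Frm}'(\mathcal{O}({\uparrow}(-) \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}(-)}), \tag{12.159}$$

which means that one has a compatible (i.e. natural) family of isomorphisms

$$C \xrightarrow{\ \cong\ } \mathrm{Frm}'(\mathcal{O}({\uparrow}C \times \mathbb{R}), \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C});$$
$$a \mapsto \hat{a}^{-1} : \mathcal{O}({\uparrow}C \times \mathbb{R}) \to \mathcal{O}(\underline{\Sigma})_{|{\uparrow}C}. \tag{12.160}$$

On basic opens ${\uparrow}D \times U \in \mathcal{O}({\uparrow}C \times \mathbb{R})$, with $D \supseteq C$, we obtain

$$\hat{a}^{-1}({\uparrow}D \times U) : \quad E \mapsto 1_U(a) \ \text{if } E \supseteq D; \qquad E \mapsto 0 \ \text{if } E \nsupseteq D. \tag{12.161}$$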
Here $1_U(a)$ is the spectral projection of a in U, cf. (12.82); as it lies in $\mathcal{P}(C)$ and $C \subseteq D \subseteq E$, the projection $1_U(a)$ certainly lies in $\mathcal{P}(E)$, as required. Furthermore, we need to extend $\hat{a}^{-1}$ to general opens in ${\uparrow}C \times \mathbb{R}$ by the frame map property, and note that (12.153) for $\varphi_C^{-1} = \hat{a}^{-1}$ is satisfied.

This analysis also holds in the topos $\mathrm{Sh}(\mathcal{C}(A))$ of sheaves on $\mathcal{C}(A)$ (as always, equipped with the Alexandrov topology, cf. (E.84)). It then follows from (12.159) and (12.141) that as a sheaf,

$$C(\underline{\Sigma}, \mathbb{C}) : U \mapsto C(\Sigma^A_U, \mathbb{C}), \tag{12.162}$$

where $\Sigma^A_U$ is given by (12.136); if $U \subseteq V$, the map $C(\Sigma^A_V, \mathbb{C}) \to C(\Sigma^A_U, \mathbb{C})$ is given by the pullback of the inclusion $\Sigma^A_U \hookrightarrow \Sigma^A_V$ (that is, by restriction). It then follows from (12.162) that the isomorphism (12.146) is given by its components

$$\underline{A}(U) \cong C(\Sigma^A_U, \mathbb{C}). \tag{12.163}$$

In particular, the component of the natural isomorphism in (12.146) at $U = {\uparrow}C$ is

$$C \cong C(\Sigma^A_{\uparrow C}, \mathbb{C}). \tag{12.164}$$

A glance at the topology of $\Sigma^A$ shows that the so-called Hausdorffication, which for a general compact space may be defined either directly, or C*-algebraically by $X^H = \Sigma(C(X))$, and coincides with the left adjoint of the forgetful functor from the category of compact Hausdorff spaces (and continuous maps) to the category of compact spaces (and continuous maps), is given by $(\Sigma^A_{\uparrow C})^H = \Sigma(C)$, so that

$$C(\Sigma^A_{\uparrow C}, \mathbb{C}) \cong C(\Sigma(C), \mathbb{C}), \tag{12.165}$$

where the isomorphism is given by restricting $f \in C(\Sigma^A_{\uparrow C}, \mathbb{C})$ to $\Sigma(C) \subset \Sigma^A_{\uparrow C}$.
Corollary 12.23. The internal Gelfand isomorphism

$$\underline{A} \xrightarrow{\ \cong\ } C(\underline{\Sigma}, \mathbb{C}), \tag{12.166}$$

which is a natural isomorphism between functors $\mathcal{C}(A) \to \mathrm{Sets}$, is given at each $C \in \mathcal{C}(A)$ by the usual Gelfand isomorphism for the commutative C*-algebra C:

$$\underline{A}_0(C) = C \xrightarrow{\ \cong\ } C(\Sigma(C), \mathbb{C}) \cong C(\underline{\Sigma}, \mathbb{C})_0(C). \tag{12.167}$$

At the end of the day, the Gelfand isomorphism (12.146) therefore turns out to simply assemble all isomorphisms (12.167) for the commutative C*-subalgebras C of A into a single sheaf-theoretic construction. Incidentally, taking $C = \mathbb{C} \cdot 1$ in (12.164) shows that $(\Sigma^A)^H$ is a point, which is also obvious from the fact that any open set containing the point $\Sigma(\mathbb{C} \cdot 1)$ of $\Sigma^A$ must be all of $\Sigma^A$.
12.5 “Daseinisation” and Kochen-Specker Theorem


The internal Gelfand transform (12.166) constructed in the previous section acts on each commutative subalgebra $C \in \mathcal{C}(A)$. What about A itself? There is a more subtle transform, inspired by the remarkable “Daseinisation” construction of Döring and Isham (whose name has unfortunately been inspired by the controversial German philosopher Heidegger), which turns self-adjoint elements a of A into continuous functions $\delta(a)$ on the topos-theoretical phase space $\Sigma^A$, whose range is the so-called interval domain $\mathbb{I}\mathbb{R}$ (which is a fuzzy version of $\mathbb{R}$). Hence we will define a map

$$\delta : A_{sa} \to C(\Sigma^A, \mathbb{I}\mathbb{R}), \tag{12.168}$$

which, alas, is defined only if A is a von Neumann algebra; we shall therefore assume this throughout this section. Similarly, the notation $\mathcal{C}(A)$ will now stand for the poset of abelian von Neumann subalgebras of A (as opposed to abelian C*-subalgebras of A, as in the remainder of this book).

“Daseinisation” requires two slightly unusual concepts, the first of which is the said interval domain $\mathbb{I}\mathbb{R}$. To motivate its definition, consider Brouwer’s approximation of real numbers by nested intervals with endpoints in $\mathbb{Q}$. For example, the real number $\pi$ can be described by specifying the sequence


[3,4], [3.1,3.2], [3.14,3.15], [3.141,3.142],...


This description of the reals is formalized by $\mathbb{I}\mathbb{R}$, defined as the poset whose elements are compact intervals [a,b] in $\mathbb{R}$ (including singletons [a,a] = {a}), ordered by reverse inclusion (for a smaller interval means that we have more information about the real number that the ever smaller intervals converge to). This poset is a so-called dcpo (for directed complete partial order); directed suprema are simply intersections. As such, it carries the Scott topology, whose open sets are upper subsets U of $\mathbb{I}\mathbb{R}$ with the additional property that for every directed set D with $\bigvee D \in U$ the intersection $D \cap U$ is nonempty. This means that each open interval (p,q) in $\mathbb{R}$ (with $p = -\infty$ and $q = +\infty$ allowed) corresponds to a Scott open

$$U_{(p,q)} = \{[a,b] \mid p < a \leq b < q\}. \tag{12.169}$$

Indeed, these opens form a basis of the Scott topology $\mathcal{O}_{\mathrm{Scott}}(\mathbb{I}\mathbb{R}) \equiv \mathcal{O}(\mathbb{I}\mathbb{R})$ of $\mathbb{I}\mathbb{R}$. This topology is, of course, a frame, so far defined in Sets. However, this frame is easily internalized to any (pre)sheaf topos, similar to the Dedekind reals (12.3) - (E.149); in particular, in T(A) we have

$$\mathcal{O}(\mathbb{I}\mathbb{R})_0 : C \mapsto \mathcal{O}(({\uparrow}C) \times \mathbb{I}\mathbb{R}), \tag{12.170}$$

with external description as a locale (see §E.4) given by the canonical projection

$$\pi_1 : \mathcal{C}(A) \times \mathbb{I}\mathbb{R} \to \mathcal{C}(A). \tag{12.171}$$
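
The order and the basic Scott opens of the interval domain described above are easy to experiment with. The following Python sketch is a minimal illustration in Sets, assuming rational endpoints only (so that arithmetic is exact); the function names are hypothetical and not taken from the text.

```python
from fractions import Fraction as F

def below(x, y):
    """x <= y in the interval domain: y is contained in x (reverse inclusion)."""
    (a, b), (c, d) = x, y
    return a <= c and d <= b

def directed_sup(chain):
    """Supremum of a nested family of intervals, i.e. their intersection."""
    lo = max(a for a, _ in chain)
    hi = min(b for _, b in chain)
    return (lo, hi)

def in_basic_open(x, p, q):
    """Membership of [a,b] in U_(p,q) = {[a,b] | p < a <= b < q}, cf. (12.169)."""
    a, b = x
    return p < a and b < q

# The nested approximations of pi quoted in the text.
chain = [
    (F(3), F(4)),
    (F("3.1"), F("3.2")),
    (F("3.14"), F("3.15")),
    (F("3.141"), F("3.142")),
]
sup = directed_sup(chain)
```

The supremum of this finite chain lies in $U_{(3,4)}$, and, as Scott openness requires, so does some member of the chain (here [3.1,3.2]), although the first interval [3,4] does not.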
The second ingredient of “Daseinisation” is the spectral order on $A_{sa}$. The partial order $\leq$ defined in §C.7 (in which $a \leq b$ iff $\omega(a) \leq \omega(b)$ for all states $\omega$ on A) has good linearity properties in that it makes $A^+$ a convex cone in the real vector space $A_{sa}$ (cf. Definition C.50), but it is terrible from a lattice point of view (unless A is abelian): for example, for A = B(H), suprema $a \vee b$ and infima $a \wedge b$ exist iff either $a \leq b$ or $b \leq a$ (and indeed $A_{sa}$ is a lattice with respect to $\leq$ iff A is abelian). However, there is a different order on $A_{sa}$ that turns it into a conditionally (or boundedly) complete lattice, i.e., a poset X with the property that if some subset $S \subseteq X$ has an upper bound (i.e., there is $x \in X$ such that $s \leq x$ for each $s \in S$), then it has a lowest upper bound (i.e., $\bigvee S$ exists), and similarly for (greatest) lower bounds.


Definition 12.24. For $a, b \in A_{sa}$ we say that $a \leq_s b$ (i.e., a is less or equal than b in the spectral order) iff $a^n \leq b^n$ for each $n \in \mathbb{N}$.


It can be shown that $a \leq_s b$ iff $e^{(b)}_\lambda \leq e^{(a)}_\lambda$ for each $\lambda \in \mathbb{R}$ (note the change of order), where $e^{(a)}_\lambda$ is the spectral projection $1_{(-\infty,\lambda] \cap \sigma(a)}(a)$, etc. This, in turn, implies that

$$a \leq_s b \quad \text{iff} \quad \mu_\omega(a \leq \lambda) \geq \mu_\omega(b \leq \lambda), \tag{12.172}$$

for each (normal) state $\omega$ on A and each $\lambda \in \mathbb{R}$, where

$$\mu_\omega(a \leq \lambda) = \omega(1_{(-\infty,\lambda] \cap \sigma(a)}(a)) \tag{12.173}$$

is the Born probability for the outcome $a \leq \lambda$ in state $\omega$ (and similarly for b). Furthermore, if a and b commute, or if a and b are both projections, then $a \leq_s b$ iff $a \leq b$, i.e., $\leq_s$ coincides with the usual partial order $\leq$ iff A is abelian, and $\leq_s$ restricts to $\leq$ on the projection lattice $\mathcal{P}(A)$ of A. For each $a \in A_{sa}$ and $C \in \mathcal{C}(A)$, we define

$$\delta^i_C(a) = \bigvee \{b \in C_{sa} \mid b \leq_s a\}; \tag{12.174}$$
$$\delta^o_C(a) = \bigwedge \{b \in C_{sa} \mid a \leq_s b\}, \tag{12.175}$$

called the inner and outer Daseinisation of a with respect to C, respectively; those objecting to Heidegger might prefer to simply call these the inner and outer localizations of a with respect to C. For projections, these expressions simplify to

$$\delta^i_C(e) = \bigvee \{f \in \mathcal{P}(C) \mid f \leq e\}; \tag{12.176}$$
$$\delta^o_C(e) = \bigwedge \{f \in \mathcal{P}(C) \mid e \leq f\}, \tag{12.177}$$

and in fact one has a very nice categorical description, in that $\delta^i_C : \mathcal{P}(A) \to \mathcal{P}(C)$ and $\delta^o_C : \mathcal{P}(A) \to \mathcal{P}(C)$ are the right and left adjoint, respectively, of the inclusion functor $\mathcal{P}(C) \hookrightarrow \mathcal{P}(A)$ in the category of complete orthomodular lattices.
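
For projections, (12.176) and (12.177) can be evaluated by brute force in small matrix algebras. The sketch below is an illustration under assumptions not made in the text: it takes $A = M_n(\mathbb{C})$ with real rational entries, fixes C to be the context of diagonal matrices, and encodes a projection in C by its support (a set of indices). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonal_projection(support, n):
    """The diagonal projection with ones exactly at the indices in `support`."""
    return [[1 if i == j and i in support else 0 for j in range(n)]
            for i in range(n)]

def leq(f, e):
    """For projections, f <= e iff e f = f (the range of f lies in that of e)."""
    return matmul(e, f) == f

def daseinise(e):
    """Supports of the inner and outer Daseinisation of e in the diagonal context."""
    n = len(e)
    subsets = [frozenset(s) for r in range(n + 1)
               for s in combinations(range(n), r)]
    below = [s for s in subsets if leq(diagonal_projection(s, n), e)]
    above = [s for s in subsets if leq(e, diagonal_projection(s, n))]
    inner = frozenset().union(*below)                   # join = union of supports
    outer = frozenset(range(n)).intersection(*above)    # meet = intersection
    return inner, outer

h = Fraction(1, 2)
plus = [[h, h], [h, h]]        # projection onto the vector (1,1)/sqrt(2)
```

For `plus` the inner Daseinisation is 0 and the outer one is the unit, so the diagonal context learns nothing from this projection; for a projection that already lies in the context, both approximations return the projection itself.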

We are now in a position to define the map (12.168): for $a \in A_{sa}$ we put

$$\delta(a) : (C, \omega) \mapsto [\omega(\delta^i_C(a)), \omega(\delta^o_C(a))], \tag{12.178}$$

where (as the notation indicates) the point $(C, \omega) \in \Sigma(C) \subset \Sigma^A$ is just $\omega \in \Sigma(C)$.
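It is easily checked that the right-hand side of (12.178) makes sense, since positivity of states and (12.174) - (12.175) obviously imply $\omega(\delta^i_C(a)) \leq \omega(\delta^o_C(a))$. Also, $\delta(a)$ is continuous, so that $\delta$ is well defined. If we define a closely related map

$$\tilde{\delta}(a) : \Sigma^A \to \mathcal{C}(A) \times \mathbb{I}\mathbb{R}; \tag{12.179}$$
$$\tilde{\delta}(a)(C, \omega) = (C, \delta(a)(C, \omega)), \tag{12.180}$$

then $\tilde{\delta}(a)$ is the external description of an internal locale map

$$\underline{\delta}(a) : \underline{\Sigma}(\underline{A}) \to \mathbb{I}\mathbb{R}. \tag{12.181}$$

In view of this, we may regard (12.168) as a hybrid (i.e. “category mistake”) map

$$\delta : A_{sa} \to C(\underline{\Sigma}(\underline{A}), \mathbb{I}\mathbb{R}); \tag{12.182}$$

see the text below (12.146), with $\mathbb{R} \rightsquigarrow \mathbb{I}\mathbb{R}$, for the meaning of the right-hand side.
The relationship between $\delta$ and the Gelfand transform (12.166) is as follows. For $a \in A_{sa}$, let $W^*(a)$ be the unital commutative von Neumann algebra generated by $a = a^*$ and $1_A$ within A. Using (12.164), we then have a Gelfandish isomorphism

$$W^*(a)_{sa} \xrightarrow{\ \cong\ } C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{R}); \tag{12.183}$$
$$b \mapsto \hat{b}. \tag{12.184}$$

In particular, since $a \in W^*(a)$, we obtain a continuous function

$$\hat{a} : \Sigma^A_{\uparrow W^*(a)} \to \mathbb{R}. \tag{12.185}$$

Furthermore, we have an inclusion

$$\iota : \mathbb{R} \hookrightarrow \mathbb{I}\mathbb{R}; \tag{12.186}$$
$$x \mapsto [x,x], \tag{12.187}$$

which is continuous, and hence induces a map $C(\Sigma^A, \mathbb{R}) \to C(\Sigma^A, \mathbb{I}\mathbb{R})$, as well as maps $C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{R}) \to C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{I}\mathbb{R})$. Then the following diagram commutes:

$$\begin{array}{ccc}
\Sigma^A_{\uparrow W^*(a)} & \xrightarrow{\ \delta(a)\ } & \mathbb{I}\mathbb{R} \\
 & {\scriptstyle \hat{a}} \searrow & \uparrow {\scriptstyle \iota} \\
 & & \mathbb{R}
\end{array} \tag{12.188}$$

In words, the restriction of the “Daseinisation” $\delta(a) : \Sigma^A \to \mathbb{I}\mathbb{R}$ of a to the open subset $\Sigma^A_{\uparrow W^*(a)} \subset \Sigma^A$ takes values in $\mathbb{R} \subset \mathbb{I}\mathbb{R}$, and as such coincides with the Gelfand transform $\hat{a}$ of a, seen as a map (12.185). Hence, as might be expected in quantum mechanics, any fuzziness of $\delta(a)$ is only noticeable outside its own context $W^*(a)$.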
The “Daseinisation” construction enables one to interpret propositions $a \in (p,q)$ as open subsets of the “phase space” $\Sigma^A$, as in classical physics, where $a : X \to \mathbb{R}$ would be a continuous function on a phase space X, and one would say that

$$[\![a \in (p,q)]\!]_{CM} = a^{-1}(p,q) \in \mathcal{O}(X). \tag{12.189}$$

In quantum mechanics, one would interpret $a \in (p,q)$ as the spectral projection

$$[\![a \in (p,q)]\!]_{QM} = e^{(a)}_{(p,q)} = 1_{(p,q) \cap \sigma(a)}(a); \tag{12.190}$$

or, equivalently, with the corresponding closed subspace of the ambient Hilbert space. In our quantum toposophy setting, however, we may adapt (12.189) as

$$[\![a \in (p,q)]\!]_{QT} = \delta(a)^{-1}(U_{(p,q)}) \in \mathcal{O}(\Sigma^A). \tag{12.191}$$

Similarly, one may interpret $a \in (p,q)$ as an internal open subset of the internal Gelfand spectrum $\underline{\Sigma}(\underline{A})$, as follows. For any locale Y in a topos T, an internal open in $\mathcal{O}(Y)$ is defined as an arrow $1 \to \mathcal{O}(Y)$, where as usual 1 is the terminal object in T. In the case at hand we have $Y = \underline{\Sigma}(\underline{A})$, and use the composition

$$1 \xrightarrow{\ (p,q)\ } \mathcal{O}(\mathbb{I}\mathbb{R}) \xrightarrow{\ \underline{\delta}(a)^{-1}\ } \mathcal{O}(\underline{\Sigma}(\underline{A})), \tag{12.192}$$

where the natural transformation (p,q) has components

$$(p,q)_C(*) = {\uparrow}C \times U_{(p,q)}, \tag{12.193}$$

cf. (12.170), and $\underline{\delta}(a)^{-1} : \mathcal{O}(\mathbb{I}\mathbb{R}) \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$ is the frame version of the locale map (12.181), whose component at C, i.e.,

$$\underline{\delta}(a)^{-1}_C : \mathcal{O}(({\uparrow}C) \times \mathbb{I}\mathbb{R}) \to \mathcal{O}(\Sigma^A_{\uparrow C}), \tag{12.194}$$

is given on basic opens in $({\uparrow}C) \times \mathbb{I}\mathbb{R}$, with $D \supseteq C$ and $p < q$, by

$$\underline{\delta}(a)^{-1}_C({\uparrow}D \times U_{(p,q)}) = \delta(a)^{-1}(U_{(p,q)}) \cap \Sigma^A_{\uparrow D}. \tag{12.195}$$

We therefore obtain the quantum-toposophical interpretation of $a \in (p,q)$ as:

$$[\![a \in (p,q)]\!]_{QT} : 1 \to \mathcal{O}(\underline{\Sigma}(\underline{A})); \tag{12.196}$$
$$[\![a \in (p,q)]\!]_{QT} = \underline{\delta}(a)^{-1} \circ (p,q). \tag{12.197}$$

We are now going to combine this expression with a construction relating states $\omega \in S(A)$ to arrows from $\mathcal{O}(\underline{\Sigma}(\underline{A}))$ to the truth object $\Omega$ in T(A). This construction generalizes the fundamental bijective correspondence between states on commutative (unital) C*-algebras A and probability measures on its Gelfand spectrum $\Sigma(A)$ (cf. Theorem B.24) to the non-commutative case.
To this end, we first need to replace probability measures on spaces by probability measures on locales. This, in turn, requires the lower real numbers $\mathbb{R}_l$, which may be identified with proper subsets $x_l \subset \mathbb{Q}$ with the following two properties:


1. If $p \in x_l$, then there exists $q \in x_l$ with $p < q$.
2. If $p < q \in x_l$, then $p \in x_l$ (i.e., $x_l$ is a lower subset of $\mathbb{Q}$).


In Sets, the lower reals may be identified with $\mathbb{R}$ (in Hilbert’s definition) by identifying $x_l$ with its supremum $x = \sup x_l$, but in arbitrary toposes (that admit internal natural and hence rational numbers) they drift apart. Similarly, one defines the upper real numbers $\mathbb{R}_u$ as proper upper subsets $x_u \subset \mathbb{Q}$ such that $p \in x_u$ implies that there exists $q \in x_u$ with $p > q$; once again, in Sets, $\mathbb{R}_u$ may be identified with Hilbert’s $\mathbb{R}$ by taking $x = \inf x_u$. The Dedekind real numbers $\mathbb{R}_d$, then, are pairs $(x_l, x_u)$ where $x_l \in \mathbb{R}_l$ and $x_u \in \mathbb{R}_u$ are such that $x_l \cap x_u = \emptyset$ and for each $p, q \in \mathbb{Q}$ with $p < q$, either $p \in x_l$ or $q \in x_u$. In Sets these may be identified with $\sup x_l = \inf x_u = x$, so that $\mathbb{R}_d \cong \mathbb{R}$, but in many toposes $\mathbb{R}_l$, $\mathbb{R}_u$, and $\mathbb{R}_d$ are all different. For example, we have already seen that in sheaf toposes Sh(X), the Dedekind reals are given by the sheaf (E.150), but the lower reals turn out to be defined by

$$(\mathbb{R}_l)_0 : U \mapsto L(U, \mathbb{R}), \tag{12.198}$$

where $U \in \mathcal{O}(X)$ and $L(U, \mathbb{R})$ is the set of all lower semicontinuous functions from U to $\mathbb{R}$ that are locally bounded from above (and similarly for $\mathbb{R}_u$, mutatis mutandis). In particular, in T(A) we have the functor

$$(\mathbb{R}_l)_0 : C \mapsto L({\uparrow}C, \mathbb{R}), \tag{12.199}$$

which is quite different from (12.3) (and similarly for $\mathbb{R}_u$).


Definition 12.25. A probability measure on a locale X is a monotone map

$$\mu : \mathcal{O}(X) \to [0,1]_l, \tag{12.200}$$

where $[0,1]_l$ is the collection of lower reals between 0 and 1 (defined by replacing $\mathbb{Q}$ in the definition of $\mathbb{R}_l$ by the set of all rationals $0 < q < 1$), that satisfies

$$\mu(\top) = 1; \tag{12.201}$$
$$\mu(U) + \mu(V) = \mu(U \wedge V) + \mu(U \vee V); \tag{12.202}$$
$$\mu\Big(\bigvee_\lambda U_\lambda\Big) = \bigvee_\lambda \mu(U_\lambda), \tag{12.203}$$

for any directed family $(U_\lambda)$ in $\mathcal{O}(X)$.


Compared with (probability) measures on $\sigma$-algebras, we see that (probability) measures on locales are merely defined on open sets (as opposed to measurable sets, which include opens), but this weakening is compensated for by the much stronger (i.e. uncountable) additivity axiom (12.203). Indeed, in Sets, if X is a compact Hausdorff space, one even has a bijective correspondence between regular probability measures $\mu'$ on X as a space and probability measures $\mu$ on X as a locale.
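
The axioms of Definition 12.25 can be tested directly on a finite discrete space, where the frame of opens is the full power set. The following Python sketch is only an illustration in Sets with exact rational values; the names are hypothetical, and the continuity axiom (12.203) is not checked separately because in a finite frame every directed family contains its own supremum, so that (12.203) then follows from monotonicity.

```python
from fractions import Fraction
from itertools import combinations

def opens(points):
    """All open sets of a finite discrete space, i.e. all subsets."""
    pts = list(points)
    return [frozenset(s) for r in range(len(pts) + 1)
            for s in combinations(pts, r)]

def measure(weights):
    """The valuation mu(U) = sum of the point weights in U."""
    return lambda U: sum((weights[x] for x in U), Fraction(0))

def is_probability_valuation(mu, points):
    """Check monotonicity, (12.201), and the modular law (12.202)."""
    top = frozenset(points)
    frame = opens(points)
    normalized = mu(top) == 1
    monotone = all(mu(U) <= mu(V) for U in frame for V in frame if U <= V)
    modular = all(mu(U) + mu(V) == mu(U & V) + mu(U | V)
                  for U in frame for V in frame)
    return normalized and monotone and modular

weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu = measure(weights)
```

A map such as $U \mapsto (|U|/3)^2$ is monotone and normalized but fails the modular law, which shows that (12.202) is a genuine constraint.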
This definition makes sense in constructive mathematics, and hence it may be internalized to T(A). Doing so, probability measures on the internal Gelfand spectrum $\underline{\Sigma}(\underline{A})$ turn out to correspond to the following notion (cf. Definition 2.26).


Definition 12.26. A quasi-state on a unital C*-algebra A is a map $\omega : A \to \mathbb{C}$ that is positive and normalized ($\omega(1_A) = 1$), satisfies $\omega(b + ic) = \omega(b) + i\omega(c)$ for $b^* = b$ and $c^* = c$, and is linear on each commutative unital C*-algebra in A.


Theorem 12.27. There is a bijective correspondence between quasi-states $\omega$ on A and probability measures $\underline{\mu}$ on the internal Gelfand spectrum $\underline{\Sigma}(\underline{A})$.


The proof uses the fact that given the (Alexandrov) topology on $\mathcal{C}(A)$, a function ${\uparrow}C \to [0,1]$ is lower semicontinuous iff it is order-preserving (i.e., monotone); since [0,1] is bounded, the condition of local boundedness is trivially satisfied and hence $L({\uparrow}C, [0,1])$ consists of all order-preserving functions from ${\uparrow}C \subset \mathcal{C}(A)$ to [0,1].


Proof. Any probability measure on $\underline{\Sigma}(\underline{A})$ is a natural transformation

$$\underline{\mu} : \mathcal{O}(\underline{\Sigma}(\underline{A})) \to [0,1]_l, \tag{12.204}$$

whose component at $C \in \mathcal{C}(A)$, according to (12.138) and (12.199), is a map

$$\underline{\mu}_C : \mathcal{O}(\Sigma^A_{\uparrow C}) \to L({\uparrow}C, [0,1]), \tag{12.205}$$

satisfying properties dictated by Definition 12.25. In particular, if C is maximal abelian in A, then by the comment preceding the proof, $\underline{\mu}_C$ is simply a function $\mathcal{O}(\Sigma(C)) \to [0,1]$ that satisfies (12.201) - (12.203) and hence is a (regular) probability measure $\mu_C$ on $\Sigma(C)$. Thus by Riesz-Markov one obtains a state $\omega_C$ on each maximal abelian C. From the topology on $\Sigma^A$ and (12.137) we see that if D is not maximal, $\underline{\mu}_D$ is determined by $\underline{\mu}_C$ for any $C \supseteq D$, so that we also obtain a probability measure $\mu_D$ on $\Sigma(D)$, or, equivalently, a state $\omega_D$, by restriction of $\omega_C$ to D. One might fear that $\mu_D$ and $\omega_D$ could depend on the chosen embedding $D \subset C$, but naturality of $\underline{\mu}$ implies that if $D \subseteq C$ as well as $D \subseteq C'$, where both C and C' are maximal, then the ensuing measures $\mu_D$ are the same. This implies the same property for the corresponding states $\omega_D$, which in turn shows that the collection of all $\omega_D$ and $\omega_C$ thus obtained organizes itself into a single quasi-state $\omega$ on A.
The converse follows by running this argument backwards.


Combining (12.196) with Theorem 12.27, we obtain a state-proposition pairing
that is no longer probabilistic, as in ordinary quantum mechanics, but defines a
proposition in the internal language of T(A) and as such may or may not be true at
each stage $C \in \mathcal{C}(A)$. The final ingredient for this is an arrow

$\underline{1}: \mathcal{O}(\underline{\Sigma}(A)) \to \underline{[0,1]}_l$, (12.206)

defined by its components $1_C: \mathcal{O}(\Sigma_{\uparrow C}) \to L({\uparrow}C, [0,1])$ that map each open subset of
$\Sigma_{\uparrow C}$ to the constant function on ${\uparrow}C$ taking the value $1 \in [0,1]$. The internal language
of T(A) (cf. §E.5) turns this into a formula $\underline{\mu}_\omega = \underline{1}$ with the following interpretation:

$[\![\underline{\mu}_\omega = \underline{1}]\!]: \mathcal{O}(\underline{\Sigma}(A)) \to \underline{\Omega}$. (12.207)

We combine this with (12.196) so as to obtain an internal state-proposition pairing

$[\![\underline{\mu}_\omega(a \in (p,q)) = \underline{1}]\!]: \underline{1} \to \underline{\Omega}$, (12.208)

where we have abbreviated

$[\![\underline{\mu}_\omega(a \in (p,q)) = \underline{1}]\!] = [\![\underline{\mu}_\omega = \underline{1}]\!] \circ [\![a \in (p,q)]\!]$. (12.209)

The truth of the proposition (12.208) at stage C may be determined from Kripke–Joyal
semantics; a straightforward computation for A = B(H) shows that

$C \Vdash \underline{\mu}_\omega(a \in (p,q)) = \underline{1}$ (12.210)

iff there exists a projection $e \in \mathcal{P}(C)$ with $e \leq e_{(p,q)}$ and $\omega(e) = 1$, where $e_{(p,q)}$ is
the spectral projection of a for the interval (p,q). Assuming $\omega$ is
a vector state $\omega(a) = \langle\psi, a\psi\rangle$ for some unit vector $\psi \in H$, this means that (12.210)
holds iff $\psi \in eH \subset e_{(p,q)}H$ for some $e \in \mathcal{P}(C)$, i.e., if the proposition $a \in (p,q)$ has
(Born) probability one in state $\psi$ and there is a yes-no measurement in context C
verifying this probability. In comparison, in classical mechanics a pure state $x \in X$
makes $a \in (p,q)$ true iff $a(x) \in (p,q)$, where $a \in C(X,\mathbb{R})$ as before.

We close this chapter with a topos-theoretical (or, one might say, topological) 
reinterpretation of the Kochen—Specker Theorem, which to some extent explains 
why the previous construction had to use the fuzzy interval domain IR rather than 
the sharp reals R. To this end, we first generalize the notion of a quasi-linear non- 
contextual hidden variable (cf. Definitions 6.1 and 6.3) to any (unital) C*-algebra: 


Definition 12.28. 1. A valuation on a unital C*-algebra A is a unital map

$V: A_{\mathrm{sa}} \to \mathbb{R}$ (12.211)

that is dispersion-free (i.e. multiplicative) and linear on commuting operators.
2. A point in a frame $\mathcal{O}(X)$ in some topos T is defined as a frame homomorphism

$p: \mathcal{O}(X) \to \Omega$, (12.212)

where $\Omega$ is the truth object in T.


If A is commutative, the Gelfand spectrum $\Sigma(A)$ consists of the valuations on A. The
second part generalizes the notion of a point of a frame in set theory (cf. §C.11).


Theorem 12.29. For any unital C*-algebra A, there are canonical bijective corre- 
spondences between: 


• Valuations on A.
• Points of $\underline{\Sigma}(A)$ in $\mathrm{Sh}(\mathcal{C}(A))$.
• Continuous cross-sections $\sigma: \mathcal{C}(A) \to \Sigma^A$ of the bundle $\pi: \Sigma^A \to \mathcal{C}(A)$.

Proof. We first give the external description of points of a locale Y in a sheaf topos
Sh(X) (cf. §E.4). The subobject classifier in Sh(X) is the sheaf $\Omega: U \mapsto \mathcal{O}(U)$,
in terms of which a point of Y is a frame map $\mathcal{O}(Y) \to \Omega$. Externally, the point-free
space defined by the frame $\Omega$ is given by the identity map $\mathrm{id}_X: X \to X$, so
that a point of Y externally corresponds to a continuous cross-section $\sigma: X \to Y$ of
the bundle $\pi: Y \to X$ (i.e., $\pi \circ \sigma = \mathrm{id}_X$). In principle, $\pi$ and $\sigma$ are by definition
frame maps in the opposite direction, but in the case at hand, namely $X = \mathcal{C}(A)$ and
$Y = \Sigma^A$, the map $\sigma: \mathcal{C}(A) \to \Sigma^A$ may be interpreted as a continuous cross-section
of the projection (12.134) in the usual sense. Being a cross-section simply means
that $\sigma(C) \in \Sigma(C)$. As to continuity, by definition of the Alexandrov topology, $\sigma$ is
continuous iff the following condition is satisfied:


For all $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ and all $C \subseteq D$, if $\sigma(C) \in \mathcal{U}$, then $\sigma(D) \in \mathcal{U}$.


Hence, given the definition of $\mathcal{O}(\Sigma^A)$, the following condition is sufficient for continuity:
if $C \subseteq D$, then $\sigma(D)_{|C} = \sigma(C)$. However, this condition is also necessary. To
explain this, let $\rho_{DC}: \Sigma(D) \to \Sigma(C)$ again be the restriction map. This map is continuous
and open. Suppose $\rho_{DC}(\sigma(D)) \neq \sigma(C)$. Since $\Sigma(D)$ is Hausdorff, there is
an open neighbourhood $\mathcal{U}_D$ of $\rho_{DC}^{-1}(\sigma(C))$ not containing $\sigma(D)$. Let $\mathcal{U}_C = \rho_{DC}(\mathcal{U}_D)$
and take any $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ such that $\mathcal{U} \cap \Sigma(C) = \mathcal{U}_C$ and $\mathcal{U} \cap \Sigma(D) = \mathcal{U}_D$.
This is possible, since $\mathcal{U}_C$ and $\mathcal{U}_D$ satisfy both conditions in the definition of $\mathcal{O}(\Sigma^A)$.
By construction, $\sigma(C) \in \mathcal{U}$ but $\sigma(D) \notin \mathcal{U}$, so that $\sigma$ is not continuous. Hence $\sigma$ is
a continuous cross-section of $\pi$ iff


$\sigma(D)_{|C} = \sigma(C)$ for all $C \subseteq D$. (12.213)


Now define a map $V: A_{\mathrm{sa}} \to \mathbb{C}$ by $V(a) = \sigma(C^*(a))(a)$, where $C^*(a)$ is the commutative
unital C*-algebra generated by a. If $b^* = b$ and $[a,b] = 0$, then $V(a+b) =
V(a) + V(b)$ by (12.213), applied to $C^*(a) \subset C^*(a,b)$ as well as to $C^*(b) \subset C^*(a,b)$.
Furthermore, since $\sigma(C) \in \Sigma(C)$, the map V is dispersion-free.

Conversely, a valuation V defines a cross-section $\sigma$ by complex linear extension
of $\sigma(C)(a) = V(a)$, where $a \in C_{\mathrm{sa}}$. By the criterion (12.213) this cross-section is
continuous, since the value V(a) is independent of the choice of C containing a.
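
For a small commutative algebra the criterion (12.213) can be checked by brute force. The following Python sketch is an illustration only and is not taken from the text: it assumes $A = \mathbb{C}^3$ (functions on the three-point set {0, 1, 2}) and uses the standard facts that a unital C*-subalgebra C of A consists of the functions constant on the blocks of a partition, that $\Sigma(C)$ is the set of blocks, and that $C \subseteq D$ iff the partition of D refines that of C. All function names are hypothetical.

```python
from itertools import product

def set_partitions(items):
    """Yield every partition of the list `items` as a list of frozensets."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i, block in enumerate(partition):
            # add `first` to an existing block
            yield partition[:i] + [block | {first}] + partition[i + 1:]
        # or give `first` a block of its own
        yield partition + [frozenset({first})]

# Contexts C (assumed model): one per partition of the point set {0, 1, 2}.
contexts = [frozenset(p) for p in set_partitions([0, 1, 2])]

def contained_in(C, D):
    """C is a subalgebra of D iff every block of D lies inside some block of C."""
    return all(any(d <= c for c in C) for d in D)

def restrict(block_of_D, C):
    """Restriction map Sigma(D) -> Sigma(C): the block of C containing block_of_D."""
    return next(c for c in C if block_of_D <= c)

def is_continuous(sigma):
    """Criterion (12.213): sigma(D) restricted to C equals sigma(C) whenever C is inside D."""
    return all(restrict(sigma[D], C) == sigma[C]
               for C in contexts for D in contexts if contained_in(C, D))

# All cross-sections (one block chosen in every context), then the continuous ones.
sections = [dict(zip(contexts, choice))
            for choice in product(*[sorted(C, key=sorted) for C in contexts])]
continuous = [s for s in sections if is_continuous(s)]

# Each point x of {0, 1, 2} defines the cross-section "block containing x".
point_sections = [{C: restrict(frozenset({x}), C) for C in contexts} for x in range(3)]
```

In this toy model the continuous cross-sections are exactly the three sections coming from the points of the spectrum of A, which is what Theorem 12.29 predicts for a commutative algebra, whose valuations are its characters.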


Corollary 12.30. The bundle $\pi: \Sigma^A \to \mathcal{C}(A)$ (cf. Corollary 12.22) admits no continuous
cross-sections as soon as A has no valuations (e.g. if $A = M_n(\mathbb{C})$, n > 2).


The contrast between the pointlessness of the internal spectrum $\underline{\Sigma}$ and the spatiality
of the external spectrum $\Sigma^A$ is striking, but easily explained: a point of $\Sigma^A$ (in
the usual sense, but also in the frame-theoretic sense if $\Sigma^A$ is sober) necessarily lies
in some $\Sigma(C) \subset \Sigma^A$, and hence is defined (and dispersion-free) only in the context
C. For example, for $A = M_n(\mathbb{C})$, a point $V \in \Sigma(C)$ corresponds to a map


$V^*: \mathcal{O}(\Sigma^A) \to \{0,1\}$, $S \mapsto V(S(C))$, (12.214)


where $\mathcal{O}(\Sigma^A)$ is given by (12.95). Thus $V^*$ is only sensitive to the value of S at C.


Notes


Previous advocates of intuitionistic logic for quantum mechanics include Popper 
(1968) and Coecke (2002). The earliest use of topos theory in quantum mechanics 
was probably by Adelman & Corbett (1995), but the founding papers of the topos 
approach to quantum mechanics as further developed in this chapter are Isham & 
Butterfield (1998), Butterfield & Isham (1999, 2002), and Hamilton, Isham & Butterfield
(2000). This series of papers was predated by Isham (1997) and was followed
by Döring & Isham (2008abcd, 2010); see also Flori (2013) for an introduction.
Wolters (2013ab) gives a detailed comparison between the “contravariant”
Butterfield–Döring–Isham approach and the “covariant” approach in this chapter.


The original motivation behind our approach to “quantum toposophy” was the 
Principle of General Tovariance (Heunen, Landsman, & Spitters, 2008), which 
was a pun on Einstein’s Principle of General Covariance underlying General Rel- 
ativity (Norton, 1993, 1995). Einstein based his theory of gravity and space-time 
on the mathematical postulate that all equations of physics be invariant under arbi- 
trary coordinate transformation, and similarly we proposed that all physical the- 
ories should be invariant under so-called geometric morphisms between toposes 
and hence should be formulated in terms of what (confusingly) is called geometric
logic (cf. Mac Lane & Moerdijk, 1992; Johnstone, 2002). Since in fact some
of our constructions turned out to be non-geometric in this sense, we subsequently
dropped this principle and stopped even referring to the above paper. However, as
Raynaud (2014) and, more generally, Henry (2015) show, our theory can actually be 
made geometric (in the topos-theoretical sense) provided one puts the entire theory 
of (internal) C*-algebras on a localic (i.e., pointfree) basis, as in Henry (2014ab). 
Other recent developments of the program (which are not discussed here) may be 
found in e.g. van den Berg & Heunen (2012, 2014), Spitters, Vickers, & Wolters 
(2014), Heunen (2014ab), and Heunen & Lindenhovius (2015). 


§12.1. C*-algebras in a topos 

C*-algebras in a topos, including a constructive version of Gelfand duality for 
commutative unital C*-algebras that is valid in arbitrary Grothendieck toposes, were 
first studied by Banaschewski & Mulvey (2000ab, 2006). The topos T(A) and the in- 
ternal commutative C*-algebra A were introduced by Heunen, Landsman, & Spitters 
(2009). All these papers rely crucially on the theory of internal locales in toposes, 
which owes much to Johnstone (1982) and Joyal & Tierney (1984). See also John- 
stone (1983) and Vickers (2007). It is possible to realize T(A) as the topos of sheaves 
on the locale $\mathrm{Idl}(\mathcal{C}(A))$, which is the ideal completion of the “mere” poset $\mathcal{C}(A)$,
but we will not use this description (Raynaud, 2014).


§12.2. The Gelfand spectrum in constructive mathematics 

This section is based on Coquand (2005) and Coquand & Spitters (2005, 2009), 
where also the missing details may be found. All necessary background on lattice 
theory is provided by Johnstone (1982), except the ingredients for the proof that the 
constructive Gelfand spectrum is compact and regular, which is due to Cederquist 
& Coquand (2000). Proposition 12.10 may be found in Aczel (2006).


§12.3. Internal Gelfand spectrum and intuitionistic quantum logic

This section is based on Caspers, Heunen, Landsman, & Spitters (2009), except
for the final part on Kripke semantics, which is taken from Heunen, Landsman,
& Spitters (2012). An interesting philosophical analysis of the intuitionistic logic
emerging from this program may be found in Hermens (2016), to whom the interpretation
of elements of the frame $\mathcal{O}(\Sigma^A)$ as disjunctions is due.


§12.4. Internal Gelfand spectrum for arbitrary C*-algebras 

This section is based on Caspers (2008), Caspers, Heunen, Landsman, & Spit- 
ters (2009), and Heunen, Landsman, & Spitters (2009). Complete proofs of Lemma 
12.15 and Lemma 12.16 may be found in Caspers (2008), §5.2. For different proofs 
of these lemmas see Heunen, Landsman, & Spitters (2009) and Coquand (2005), 
respectively. A proof of Lemma 12.21 may be found in Wolters (2013b), Theorem 
2.17, also available as http://arxiv.org/pdf/1010.2031v2.pdf.


§12.5. “Daseinisation” and Kochen-Specker Theorem 

The spectral order was introduced by Olson (1971) and was rediscovered by De 
Groote (2011). For a devastating critique of Heidegger’s philosophy see Philipse 
(1999). The first construction of a “Daseinisation” map was given by Döring &
Isham (2008b). The version presented here is an improvement, due to Wolters
(2013ab), of a previous adaptation of the Döring–Isham approach to the topos T(A)
in Heunen, Landsman, & Spitters (2009). Similarly, Theorem 12.29, first published 
in Heunen, Landsman, Spitters, & Wolters (2012), is an improvement due to Wolters 
(2013a) of an earlier result in this direction in Heunen, Landsman, & Spitters (2009). 


The work of Isham & Butterfield (1998), which, as already mentioned, started the
entire quantum toposophy program, was actually motivated by a topos-theoretical
reformulation of the Kochen–Specker Theorem. Isham and Butterfield started from
the following observation. Let $\mathcal{C}(B(H))$ be the poset of commutative von Neumann
subalgebras of B(H), partially ordered by set-theoretic inclusion, seen as a category
in the usual way. Consider the presheaf topos $[\mathcal{C}(B(H))^{\mathrm{op}}, \mathsf{Set}]$ of contravariant functors
$F: \mathcal{C}(B(H)) \to \mathsf{Set}$, where Set is the category of sets. The spectral presheaf is
the contravariant functor $\Sigma$ defined on objects by $\Sigma_0(C) = \Sigma(C)$, and by the natural
map on arrows, that is, $\Sigma_1(C \subseteq D)$ maps $\omega \in \Sigma(D)$ (which is a map $D \to \mathbb{C}$) to its
restriction to C, i.e., to $\omega_{|C} \in \Sigma(C)$. A point of some object F in $[\mathcal{C}(B(H))^{\mathrm{op}}, \mathsf{Set}]$
is defined as a natural transformation $1 \to F$, where 1 is the terminal object, i.e., the
presheaf that maps everything into the singleton set $*$.

The Kochen–Specker Theorem à la Butterfield & Isham, then, states that if
dim(H) > 2 as usual, the spectral presheaf has no points.


Appendix A 
Finite-dimensional Hilbert spaces 


Although we assume the reader to be familiar with linear algebra, some of the points 
below may not be emphasized at that level and hence need to be recalled. 

Unless explicitly stated otherwise, all vector spaces (and hence also all algebras) 
are defined over the complex numbers C. Moreover, from §A.2 until the end of this 
appendix, V will be finite-dimensional; the infinite-dimensional case will be treated 
in the next appendix on functional analysis and general Hilbert spaces. 


A.1 Basic definitions 


Definition A.1. Let V be a vector space (not necessarily finite-dimensional). 


1. A sesquilinear form on V is a map $V \times V \to \mathbb{C}$, written $(v,w) \mapsto \langle v,w\rangle$ (or
occasionally, to distinguish it from an inner product, as $(v,w) \mapsto B(v,w)$) that is
real-bilinear and satisfies $\langle iv,w\rangle = -i\langle v,w\rangle$ and $\langle v,iw\rangle = i\langle v,w\rangle$ for all $v,w \in V$.

2. A hermitian form on V is a sesquilinear form that satisfies $\langle w,v\rangle = \overline{\langle v,w\rangle}$.

3. A pre-inner product on V is a positive hermitian form, i.e., $\langle v,v\rangle \geq 0$.

4. An inner product on V is, in addition, positive definite: $\langle v,v\rangle = 0$ iff $v = 0$.

5. A norm on V is a function $\|\cdot\|: V \to \mathbb{R}^+$ satisfying, for all $v,w \in V$ and $\lambda \in \mathbb{C}$:


a. $\|v+w\| \leq \|v\| + \|w\|$ (triangle inequality);
b. $\|\lambda v\| = |\lambda| \|v\|$ (homogeneity);
c. $\|v\| = 0$ iff $v = 0$ (positive definiteness).


Many analytical arguments in functional analysis are based on the fundamental 
Cauchy-Schwarz inequality, which is satisfied by any (pre-) inner product: 


$|\langle v,w\rangle|^2 \leq \langle v,v\rangle \langle w,w\rangle$. (A.1)

Proposition A.2. An inner product on V defines a norm on V by means of


$\|v\| = \sqrt{\langle v,v\rangle}$. (A.2)

The Cauchy–Schwarz inequality (A.1) then reads

$|\langle v,w\rangle| \leq \|v\| \|w\|$, (A.3)

with equality iff v and w are linearly dependent.


The question arises when a norm comes from an inner product via (A.2). 


Theorem A.3. A norm $\|\cdot\|$ comes from an inner product through (A.2) iff

$\|v+w\|^2 + \|v-w\|^2 = 2(\|v\|^2 + \|w\|^2)$. (A.4)


In that case, one has the polarization identity


$\langle v,w\rangle = \tfrac{1}{4}\left(\|v+w\|^2 - \|v-w\|^2 + i\|v-iw\|^2 - i\|v+iw\|^2\right)$. (A.5)

Proof. Given (A.4), define $\langle v,w\rangle$ by (A.5). Easy computations show that (A.2) holds,
that $\langle w,v\rangle = \overline{\langle v,w\rangle}$, and, with a bit
more effort, that $\langle v,w_1 + w_2\rangle = \langle v,w_1\rangle + \langle v,w_2\rangle$. Now suppose we know that


$\langle w, sv\rangle = s\langle w,v\rangle$ (A.6)


for certain $s \in \mathbb{R}$. Then this property clearly also holds for $s^{-1}$ instead of s. Furthermore,
having (A.6) for s as well as $t \in \mathbb{R}$ implies the same property also for
s + t and st. Starting with s = t = 1, this generates (A.6) for each $s \in \mathbb{Q}$. Now if
$s_n \to s$ for $s_n \in \mathbb{Q}$ and $s \in \mathbb{R}$, then by continuity and homogeneity of the norm,
$\langle w, s_n v\rangle \to \langle w, sv\rangle$. Consequently, (A.6) holds for each $s \in \mathbb{R}$. Finally, from (A.5) we
also find $\langle w,iv\rangle = i\langle w,v\rangle$, and hence (A.6) holds for each $s \in \mathbb{C}$.
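
The parallelogram law (A.4) and the polarization identity (A.5) are easy to check numerically. The sketch below is an illustration rather than part of the proof; it assumes the standard inner product on $\mathbb{C}^2$, conjugate-linear in the first argument as in the convention of Definition A.1, and all function names and the two test vectors are hypothetical choices.

```python
def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = <v, v>, cf. (A.2)."""
    return inner(v, v).real

def add(v, w, c=1):
    """Return the vector v + c*w."""
    return [x + c * y for x, y in zip(v, w)]

def polarization(v, w):
    """Right-hand side of the polarization identity (A.5)."""
    return 0.25 * (norm_sq(add(v, w)) - norm_sq(add(v, w, -1))
                   + 1j * norm_sq(add(v, w, -1j)) - 1j * norm_sq(add(v, w, 1j)))

v = [1 + 2j, 3 - 1j]
w = [2 - 1j, 1j]

# Both sides of the parallelogram law (A.4).
parallelogram_lhs = norm_sq(add(v, w)) + norm_sq(add(v, w, -1))
parallelogram_rhs = 2 * (norm_sq(v) + norm_sq(w))
```

Here `polarization(v, w)` reproduces `inner(v, w)` from norms alone, which is the content of (A.5).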


There is an analogous result for continuous hermitian forms, with practically the 
same proof (where continuity is once again needed to pass from Q to R). Let V be 
a vector space with inner product, and let B: V x V > C be a hermitian form. The 
associated quadratic form Q: V — R, defined by 


$Q(v) = B(v,v)$, (A.7)

then satisfies

$Q(zv) = |z|^2 Q(v)$ $(z \in \mathbb{C})$; (A.8)
$Q(v+w) + Q(v-w) = 2(Q(v) + Q(w))$. (A.9)


Proposition A.4. Let V be a vector space with inner product. A map $Q: V \to \mathbb{R}$
that is continuous in the associated norm (A.2) is derived from a hermitian form
$B: V \times V \to \mathbb{C}$ through (A.7) iff Q satisfies (A.8) - (A.9), in which case


$B(v,w) = \tfrac{1}{4}\left(Q(v+w) - Q(v-w) + iQ(v-iw) - iQ(v+iw)\right)$. (A.10)


A.2 Functionals and the adjoint


In the remainder of this appendix, V is a finite-dimensional complex vector space
with inner product. Since this is automatically a (finite-dimensional) Hilbert space
(as defined in the next appendix), we rename it as H. The archetypal example is
$H = \mathbb{C}^n$, with elements $z = (z_1,\ldots,z_n)$, $z_i \in \mathbb{C}$, and standard inner product


$\langle z,w\rangle = \sum_{i=1}^n \overline{z_i} w_i$. (A.11)


In that case, we hardly make a difference between a linear map $a: H \to H$ and the
corresponding matrix $(a_{ij})$, where $(az)_i = \sum_j a_{ij} z_j$, or, equivalently,


$a_{ij} = \langle \upsilon_i, a\upsilon_j\rangle$, (A.12)


where $(\upsilon_1 = (1,0,\ldots,0), \ldots, \upsilon_n = (0,\ldots,0,1))$ is the standard basis of $\mathbb{C}^n$. More
generally, we will only consider orthonormal bases of Hilbert spaces H, i.e., bases
$(\upsilon_i)$ for which $\langle\upsilon_i,\upsilon_j\rangle = \delta_{ij}$. In fact, in the present (finite-dimensional) case, any
orthonormal set of n = dim(H) vectors is automatically a basis. Throughout this
book, the word “basis” will be synonymous with orthonormal basis.

Let H* be the vector space of linear maps f : H — C, also called (linear) func- 
tionals (on H). Since the inner product is positive definite, it is also non-degenerate: 


Proposition A.5. The map $\psi \mapsto f_\psi$, where


$f_\psi(\varphi) = \langle\psi,\varphi\rangle$, (A.13)


is an anti-linear isomorphism $H \to H^*$ (i.e., one has $\lambda\psi \mapsto \overline{\lambda} f_\psi$ for any $\lambda \in \mathbb{C}$).


Proof. Injectivity is obvious. For surjectivity, note that coker(f) (i.e., the orthogonal
complement of the kernel ker(f) of f) is one-dimensional (assuming f is nonzero),
and take a unit vector $\chi \in \mathrm{coker}(f)$. Then $\psi = \overline{f(\chi)}\chi$ does the job: by linearity of
f, we have $f(\varphi)\chi - f(\chi)\varphi \in \ker(f)$ for any $\varphi \in H$ (and even any $\chi \in H$), so that
$\langle\chi, f(\varphi)\chi - f(\chi)\varphi\rangle = 0$. Since $\langle\chi,\chi\rangle = \|\chi\|^2 = 1$, this yields $f = f_\psi$.

A linear map $a: H \to H$ is also called an operator; we denote the algebra of
all operators on H by B(H). For example, we have $B(\mathbb{C}^n) = M_n(\mathbb{C})$. Two arbitrary
vectors $\psi, \varphi \in H$ define an operator $|\psi\rangle\langle\varphi|$ through Dirac’s “bra-ket” notation


$|\psi\rangle\langle\varphi|\chi = \langle\varphi,\chi\rangle\psi$. (A.14)


The adjoint $a^*$ of an operator a is defined by the property


$\langle a^*\psi,\varphi\rangle = \langle\psi,a\varphi\rangle$ $(\psi,\varphi \in H)$. (A.15)


Indeed, for given $\chi$ (and a), define a functional $f^a_\chi: H \to \mathbb{C}$ by $f^a_\chi(\varphi) = \langle\chi,a\varphi\rangle$.
Then, as we just saw, $f^a_\chi = f_\psi$ for some unique $\psi \in H$; define $a^*$ by $a^*\chi = \psi$. This
map is linear by construction.

Clearly, one has

$a^{**} = a$. (A.16)


For $H = \mathbb{C}^n$, the matrix corresponding to the adjoint $a^*$ is given by the well-known
formula $a^*_{ij} = \overline{a_{ji}}$. A more abstract example of an adjoint is given by


$(|\psi\rangle\langle\varphi|)^* = |\varphi\rangle\langle\psi|$. (A.17)
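
The defining property (A.15) and the examples (A.16) and (A.17) can be verified directly for matrices. The following sketch is illustrative only: it assumes $H = \mathbb{C}^2$ with the inner product (A.11), realizes the adjoint as the conjugate transpose, and uses hypothetical function names and arbitrarily chosen test data.

```python
def inner(v, w):
    """Inner product (A.11): conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def ketbra(psi, phi):
    """Matrix of the operator |psi><phi| of (A.14)."""
    return [[p * q.conjugate() for q in phi] for p in psi]

a = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
psi = [1 + 1j, 2 - 3j]
phi = [0.5j, 1 - 1j]
chi = [2 + 0j, -1j]

lhs = inner(apply(adjoint(a), psi), phi)   # <a* psi, phi>
rhs = inner(psi, apply(a, phi))            # <psi, a phi>

# |psi><phi| chi versus <phi, chi> psi, cf. (A.14)
ketbra_on_chi = apply(ketbra(psi, phi), chi)
expected_on_chi = [inner(phi, chi) * x for x in psi]
```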


The (operator) norm of $a: H \to H$ is defined by

$\|a\| = \sup\{\|a\psi\|, \psi \in H_1\}$, (A.18)

where the unit sphere $H_1 \subset H$ is defined by

$H_1 = \{\psi \in H, \|\psi\| = 1\}$. (A.19)

Proposition A.6. One has $\|a\| < \infty$ for any linear map $a: H \to H$.


Proof. Recall that $\dim(H) = n < \infty$! Map H to $\mathbb{C}^n$ by the choice of some basis
$(\upsilon_i)$. Thus $\psi \in H$ is mapped to $\tilde\psi = (\psi_1,\ldots,\psi_n) \in \mathbb{C}^n$, with $\psi_i = \langle\upsilon_i,\psi\rangle$, and we
have $\|\tilde\psi\|_2 = \|\psi\|$, where $\|z\|_2^2 = \sum_i |z_i|^2$ is the usual norm on $\mathbb{C}^n$, which is given
by (A.2) with (A.11). This also transfers the operator $a: H \to H$ to a linear map
$\tilde a: \mathbb{C}^n \to \mathbb{C}^n$ defined by the matrix (A.12). Then $\|a\| = \|\tilde a\| = \sup\{\|\tilde a z\|_2, z \in \mathbb{C}^n_1\}$,
where $\mathbb{C}^n_1 = \{z \in \mathbb{C}^n, \|z\|_2 = 1\}$. Now $\tilde a$ is continuous because it is linear, and hence
it maps $\mathbb{C}^n_1$ (which is compact by Heine–Borel) to some compact set $\tilde a(\mathbb{C}^n_1)$ in $\mathbb{C}^n$.
It is easy to see that the norm $\|\cdot\|_2: \mathbb{C}^n \to \mathbb{R}^+$ is continuous, and according to
Weierstrass the norm therefore assumes a finite maximum (as well as a minimum)
on any compact set K. Taking $K = \tilde a(\mathbb{C}^n_1)$ proves the claim.
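
To make the supremum in (A.18) concrete, the sketch below approximates it for a $2 \times 2$ matrix by sampling the unit sphere of $\mathbb{C}^2$ on a grid (up to an irrelevant overall phase). This is only an illustration under stated assumptions: the matrix, the grid size, and all function names are hypothetical, and the comparison value uses the standard fact that $\|a\|^2$ is the largest eigenvalue of $a^*a$, which in the present text follows from (A.23) and Corollary A.12.

```python
import cmath
import math

def apply(a, v):
    """Matrix-vector product a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Matrix product a b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def norm(v):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def sampled_operator_norm(a, steps=200):
    """Approximate sup{||a psi|| : ||psi|| = 1} on C^2 from below, cf. (A.18)."""
    best = 0.0
    for k in range(steps + 1):
        theta = 0.5 * math.pi * k / steps
        for l in range(steps):
            phi = 2 * math.pi * l / steps
            # every unit vector in C^2 has this form up to an overall phase
            psi = [math.cos(theta), cmath.exp(1j * phi) * math.sin(theta)]
            best = max(best, norm(apply(a, psi)))
    return best

def largest_eigenvalue_hermitian_2x2(h):
    """Largest root of the characteristic polynomial of a hermitian 2x2 matrix."""
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return 0.5 * (tr + math.sqrt(max(tr * tr - 4 * det, 0.0)))

a = [[1 + 1j, 2], [0, 1 - 2j]]
norm_a = sampled_operator_norm(a)
norm_a_star = sampled_operator_norm(adjoint(a))
top = largest_eigenvalue_hermitian_2x2(matmul(adjoint(a), a))  # equals 10 for this a
```

The sampled value `norm_a` approaches $\sqrt{10}$ from below, and `norm_a_star` agrees with it up to the sampling error, in line with (A.22).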


Proposition A.7. Let $a,b: H \to H$ be linear maps, and let $\psi \in H$. Then:


$\|a\psi\| \leq \|a\| \|\psi\|$; (A.20)
$\|ab\| \leq \|a\| \|b\|$; (A.21)
$\|a^*\| = \|a\|$; (A.22)
$\|a^*a\| = \|a\|^2$. (A.23)


Proof. The first two inequalities are immediate from (A.18). Next, if $\|\psi\| = 1$, by
(A.3), (A.15), and (A.20) we have


$\|a^*\psi\|^2 = \langle a^*\psi,a^*\psi\rangle = \langle\psi,aa^*\psi\rangle \leq \|\psi\| \|aa^*\psi\| \leq \|a\| \|a^*\psi\|$, (A.24)


so $\|a^*\psi\| \leq \|a\|$, and hence from (A.18), $\|a^*\| \leq \|a\|$. But (A.16) gives the opposite
inequality, whence (A.22). Finally, (A.21) and (A.22) yield $\|a^*a\| \leq \|a^*\| \|a\| =
\|a\|^2$. From (A.3) and (A.20), on the other hand, for $\|\psi\| = 1$ we obtain


$\|a\psi\|^2 = \langle a\psi,a\psi\rangle = \langle\psi,a^*a\psi\rangle \leq \|a^*a\|$, (A.25)


so $\|a\|^2 \leq \|a^*a\|$ by (A.18), and hence (A.23) is proved.


A.3 Projections


The most important examples (and also, as we will see shortly, building blocks) of
self-adjoint operators are projections $e: H \to H$, defined by the property


$e^2 = e^* = e$. (A.26)


Proposition A.8. There is a bijective correspondence $e \leftrightarrow L$ between:


• projections e on H;
• linear subspaces L of H,


given by


$L = eH$; (A.27)
$e = \sum_{i=1}^{\dim(L)} |\upsilon_i\rangle\langle\upsilon_i|$, (A.28)


where $eH = \{e\psi, \psi \in H\}$ is the image of e, and $(\upsilon_i)$ is an arbitrary basis of L.


The proof is routine, including the fact that (A.28) is independent of the basis.
Whenever convenient, we write (A.28) as $e_L$. For example, the “sub”space L = H
corresponds to $e_H = 1_H$, whereas L = {0} corresponds to $e_{\{0\}} = 0$.
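
A small numerical example may help to see (A.26) - (A.28) at work. The sketch below is illustrative only: it assumes $H = \mathbb{C}^3$ with the standard inner product, takes L to be a two-dimensional subspace chosen for this example, and builds e from two different orthonormal bases of L; all names are hypothetical.

```python
import math

def inner(v, w):
    """Inner product (A.11): conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def matmul(a, b):
    """Matrix product a b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def projection(basis):
    """e = sum_i |v_i><v_i| for an orthonormal basis (v_i) of L, cf. (A.28)."""
    n = len(basis[0])
    return [[sum(v[i] * complex(v[j]).conjugate() for v in basis) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
v1 = [1, 0, 0]
v2 = [0, s, 1j * s]
# A second orthonormal basis of the same subspace L = span(v1, v2).
w1 = [s * (x + y) for x, y in zip(v1, v2)]
w2 = [s * (x - y) for x, y in zip(v1, v2)]

e = projection([v1, v2])
e_other_basis = projection([w1, w2])

psi = [1, 2, 3]
psi_in_L = apply(e, psi)                             # component e psi in L = eH
psi_rest = [x - y for x, y in zip(psi, psi_in_L)]    # (1 - e) psi
```

Both bases give the same matrix e, and the remainder `psi_rest` is orthogonal to L.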


Define the orthogonal complement $L^\perp$ of any subset $L \subset H$ by

$L^\perp = \{\psi \in H \mid \langle\psi,\varphi\rangle = 0 \ \forall \varphi \in L\}$. (A.29)


In particular, if L is a linear subspace of H, one easily checks that

$e_{L^\perp} = 1_H - e_L$. (A.30)

Corollary A.9. For each linear subspace $L \subset H$ one has

$H = L \oplus L^\perp$, (A.31)


in the sense that $L \cap L^\perp = \{0\}$, and each vector $\psi \in H$ has a unique decomposition


$\psi = \psi^\parallel + \psi^\perp$, (A.32)

where $\psi^\parallel \in L$ and $\psi^\perp \in L^\perp$.


Proof. Existence of the decomposition is given by


$\psi^\parallel = e_L\psi$; (A.33)
$\psi^\perp = (1_H - e_L)\psi$. (A.34)

Uniqueness follows by assuming $\psi = \chi^\parallel + \chi^\perp$ with $\chi^\parallel \in L$ and $\chi^\perp \in L^\perp$: one then
has $\psi^\parallel - \chi^\parallel = \chi^\perp - \psi^\perp$, but since the left-hand side is in L and the right-hand side
is in $L^\perp$, both sides lie in $L \cap L^\perp = \{0\}$.


A.4 Spectral theory


An eigenvector of an operator a is a nonzero element $\psi \in H$ such that

$a\psi = \lambda\psi$ (A.35)

for some $\lambda \in \mathbb{C}$, called an eigenvalue of a. We also define the eigenspace $H_\lambda$ by

$H_\lambda = \{\psi \in H \mid a\psi = \lambda\psi\}$, (A.36)


with associated projection $e_\lambda$ (in that $H_\lambda = e_\lambda H$, cf. Proposition A.8). In case that
$\dim(H_\lambda) = 1$ the eigenvalue $\lambda$ is called non-degenerate (or simple). Otherwise it
is said to be degenerate, with multiplicity $m_\lambda = \dim(H_\lambda)$. In linear algebra, the set
of all eigenvalues of a is called the spectrum of a, denoted by $\sigma(a)$ (for infinite-dimensional
H, this turns out to be the wrong definition of the spectrum, see §B.14).

We now give two formulations of the spectral theorem for self-adjoint operators. 


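Theorem A.10. Let a be a self-adjoint operator on H. Then $\sigma(a) \subset \mathbb{R}$, eigenspaces
for different eigenvalues $\lambda \neq \mu$ are orthogonal (i.e., $e_\lambda e_\mu = \delta_{\lambda\mu} e_\lambda$), and


$a = \sum_{\lambda \in \sigma(a)} \lambda \cdot e_\lambda$; (A.37)
$1_H = \sum_{\lambda \in \sigma(a)} e_\lambda$. (A.38)


Equivalently, we may reformulate the above spectral resolution of a in terms of the
existence of a basis $(\upsilon_i)$ of H consisting of eigenvectors of a. In that case, we have


$a = \sum_{i=1}^{\dim(H)} \lambda_i |\upsilon_i\rangle\langle\upsilon_i|$; (A.39)
$1_H = \sum_{i=1}^{\dim(H)} |\upsilon_i\rangle\langle\upsilon_i|$, (A.40)


where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $\upsilon_i$ (i.e., $a\upsilon_i = \lambda_i\upsilon_i$).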
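
As a concrete instance of (A.37) - (A.38), the following sketch computes the eigenvalues and spectral projections of a hermitian $2 \times 2$ matrix in closed form and reassembles the matrix from them. It is an illustration under stated assumptions: the matrix is chosen for this example, the helper `spectral_data` is hypothetical, and the eigenvector formula used requires a nonzero off-diagonal entry.

```python
import math

def spectral_data(a):
    """Map each eigenvalue of a hermitian 2x2 matrix (with a[0][1] != 0)
    to the matrix of its spectral projection |v><v| / <v, v>."""
    p, q, r = a[0][0].real, a[0][1], a[1][1].real
    mean = 0.5 * (p + r)
    radius = math.sqrt(0.25 * (p - r) ** 2 + q.real ** 2 + q.imag ** 2)
    data = {}
    for lam in (mean - radius, mean + radius):
        v = [q, lam - p]                       # solves (a - lam) v = 0
        norm_sq = sum(abs(x) ** 2 for x in v)
        data[lam] = [[v[i] * v[j].conjugate() / norm_sq for j in range(2)]
                     for i in range(2)]
    return data

a = [[2, 1 - 1j], [1 + 1j, 3]]
data = spectral_data(a)
eigenvalues = sorted(data)          # the spectrum of a; here 1 and 4

# Right-hand sides of (A.37) and (A.38).
reconstructed = [[sum(lam * data[lam][i][j] for lam in data) for j in range(2)]
                 for i in range(2)]
total = [[sum(data[lam][i][j] for lam in data) for j in range(2)] for i in range(2)]

# Product of the two spectral projections, which should vanish.
e_lo, e_hi = data[eigenvalues[0]], data[eigenvalues[1]]
product = [[sum(e_lo[i][k] * e_hi[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

Since both eigenvalues are simple here, the choice-free resolution (A.37) and the basis-dependent form (A.39) coincide up to phases of the eigenvectors.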


Note that the eigenvalues $\lambda$ occurring in (A.37) are all different, whereas the $\lambda_i$ in
(A.39) need not be: the number of times an eigenvalue $\lambda_i \in \sigma(a)$ occurs is given
by its multiplicity. This also implies that the spectral resolution (A.37) - (A.38) is
canonical (i.e. free of any choices), whereas (A.39) - (A.40) depends on arbitrary
choices of bases in all subspaces $H_\lambda$ with dimension greater than one. Nonetheless,
it is easier to prove (A.39) - (A.40), which obviously imply (A.37) - (A.38): just
collect all $\lambda_i$ that are equal to $\lambda$ and realize that, as in (A.28), one has


$e_\lambda = \sum_{i \mid \lambda_i = \lambda} |\upsilon_i\rangle\langle\upsilon_i|$. (A.41)


More generally, for some (at the moment) arbitrary (but later: measurable) subset
$\Delta \subset \mathbb{R}$ it turns out to be convenient to introduce the spectral projection $e_\Delta$ on H
and the associated spectral subspace $H_\Delta \subset H$: if $\Delta \cap \sigma(a) = \emptyset$ we put $e_\Delta = 0$ and
$H_\Delta = \{0\}$, and otherwise,


$e_\Delta = \sum_{\lambda \in \Delta \cap \sigma(a)} e_\lambda$; (A.42)
$H_\Delta = e_\Delta H$. (A.43)


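We now prepare for the proof of Theorem A.10. First, note from (A.15) that


$2i \cdot \mathrm{Im}(\langle\psi,a\psi\rangle) = \langle\psi,a\psi\rangle - \overline{\langle\psi,a\psi\rangle} = \langle\psi,a\psi\rangle - \langle\psi,a^*\psi\rangle$. (A.44)

If $a^* = a$, from (A.35) and (A.44) one obtains $\mathrm{Im}(\lambda) = 0$ and hence $\sigma(a) \subset \mathbb{R}$.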


Lemma A.11. A self-adjoint operator a has an eigenvalue $\lambda$ for which $|\lambda| = \|a\|$.


Proof. As in the previous proof, the norm $\|\cdot\|$ assumes a maximum on the compact
set $aH_1$, where $H_1 = \{\psi \in H, \|\psi\| = 1\}$. Suppose this happens at $a\psi_1$, where by
construction $\|\psi_1\| = 1$. By definition of the norm, this maximum must be $\|a\|$, so
that $\|a\| = \|a\psi_1\|$. Hence, using $a^* = a$, (A.3), and (A.23), we may estimate


$\|a\|^2 = \|a\psi_1\|^2 = \langle a\psi_1,a\psi_1\rangle = \langle\psi_1,a^2\psi_1\rangle \leq \|a^2\psi_1\| \leq \|a^2\| = \|a\|^2$. (A.45)


Hence we need equality at the $\leq$ signs in (A.45), which according to the remark
below (A.3) can only be the case if $a^2\psi_1 = \|a\|^2\psi_1$. Define $\chi_1 = a\psi_1 - \|a\|\psi_1$.
There are two possibilities: if $\chi_1 = 0$, then $a\psi_1 = \|a\|\psi_1$, and if $\chi_1 \neq 0$, then


$a\chi_1 = a^2\psi_1 - \|a\|a\psi_1 = \|a\|^2\psi_1 - \|a\|a\psi_1 = -\|a\|\chi_1$. (A.46)


Hence either $a\psi_1 = \|a\|\psi_1$ or $a\chi_1 = -\|a\|\chi_1$, which proves the claim.


We are now in a position to prove Theorem A.10.


Proof. By Lemma A.11, we already found one eigenvector $\upsilon_1$ of a, viz. either $\upsilon_1 =
\psi_1$ or $\upsilon_1 = \chi_1$. Furthermore, it is easy to show that if a self-adjoint operator a leaves
a linear subspace $L \subset H$ stable (in that $a\varphi \in L$ whenever $\varphi \in L$), then it also leaves
$L^\perp$ stable, and remains self-adjoint as an operator $a: L^\perp \to L^\perp$. First use this with
$L_1 = \mathbb{C} \cdot \upsilon_1$. Lemma A.11, now applied to $a: L_1^\perp \to L_1^\perp$, gives a second eigenvector
$\upsilon_2$. Now take $L_2$ to be the linear span of $\upsilon_1$ and $\upsilon_2$, and restrict a to $L_2^\perp$, etc. Since
H is finite-dimensional, this procedure ends after dim(H) steps.

This leaves us with a basis $(\upsilon_i)$ of H that by construction entirely consists of
eigenvectors. The mutual orthogonality of these eigenvectors (and hence of the spectral
projections $e_\lambda$) follows from a simple calculation.


Corollary A.12. The norm of a self-adjoint operator a is given by


$\|a\| = \sup\{|\lambda|, \lambda \in \sigma(a)\}$. (A.47)

Proof. This rapidly follows from Theorem A.10 by expanding $\psi$ in (A.18) with
respect to the basis of H given in (A.39) - (A.40).


Corollary A.13. A self-adjoint operator a is a projection iff:

• $\sigma(a) = \{0\}$, in which case a = 0;

• $\sigma(a) = \{1\}$, in which case $a = 1_H$;

• $\sigma(a) = \{0,1\}$, in which case a is called a proper projection.


In particular, if e is a nonzero projection, then

$\|e\| = 1$. (A.48)


Proof. Only the third case is nontrivial. If a = e is a proper projection, then by
Corollary A.9 its eigenvectors can only lie in L = eH (with eigenvalue $\lambda = 1$) or
in $L^\perp = (1_H - e)H$ (with eigenvalue $\lambda = 0$). The converse implication follows from
Theorem A.10, notably from (A.37). Eq. (A.48) then follows from (A.47).


A less elementary but more powerful approach to the spectral theorem is as fol- 
lows. For the notion of a C*-algebra see Definition C.1 in Appendix C. 


Definition A.14. Let $a \in B(H)$. Then $C^*(a)$ is the C*-algebra generated by a and
$1_H$ (i.e., the algebra of all polynomials in a).


Theorem A.15. If a is self-adjoint, then $C^*(a)$ is commutative, and:

1. There is an isomorphism of (commutative) C*-algebras

$C(\sigma(a)) \cong C^*(a)$, (A.49)


written $f \mapsto f(a)$, which is unique if it is subject to the following conditions:


• the unit function $1_{\sigma(a)}: \lambda \mapsto 1$ corresponds to the unit operator $1_H$;
• the identity function $\mathrm{id}_{\sigma(a)}: \lambda \mapsto \lambda$ is mapped to the given operator a.


2. In terms of the spectral projections $e_\lambda$ of the operator a we have

$C^*(a) = C^*(e_\lambda, \lambda \in \sigma(a)) = \mathrm{span}(e_\lambda, \lambda \in \sigma(a))$, (A.50)


where the middle term is the C*-algebra generated by the projections $e_\lambda$.
3. Under the isomorphism (A.49),


$e_\lambda = \delta_\lambda(a)$, (A.51)

where the delta-function $\delta_\lambda$ on $\sigma(a)$ is defined by $\delta_\lambda: \mu \mapsto \delta_{\lambda\mu}$.


Proof. For any complex (finite) polynomial $p(x) = \sum_n c_n x^n$ on $\mathbb{R}$, define an operator


$p(a) = \sum_n c_n a^n$. (A.52)


Simple computations then show that, for arbitrary polynomials p, q and $t \in \mathbb{C}$,

$(tp + q)(a) = tp(a) + q(a)$; (A.53)
$(pq)(a) = p(a)q(a)$; (A.54)
$p(a)^* = \overline{p}(a)$. (A.55)


Hence the space $P^*(a)$ of all such polynomials in a forms a *-algebra in B(H). As
a linear subspace of the finite-dimensional vector space B(H), $P^*(a)$ must itself be
finite-dimensional, hence it is a C*-algebra. Moreover, $P^*(a)$ clearly contains $1_H$ (take
p(x) = 1) as well as a (take p(x) = x), and since $P^*(a) \subseteq C^*(a)$ by definition of the
latter, we must have $P^*(a) = C^*(a)$. Since pq = qp and hence p(a)q(a) = q(a)p(a)
by the above computations, it follows that $P^*(a)$ and hence $C^*(a)$ is commutative.
This proves the first claim.
To establish the isomorphism (A.49), we are going to define a map


$C(\sigma(a)) \ni f \mapsto f(a) \in C^*(a)$. (A.56)


We initially do this for polynomials f = p, so that f(a) = p(a) is defined by (A.52).
Since $C^*(a) = P^*(a)$ consists of polynomials in a, the map (A.56) is evidently surjective.
It is also injective, for suppose p(a) = q(a). Applying this to an eigenvector
$\upsilon_\lambda \in H_\lambda$ yields $p(\lambda) = q(\lambda)$, for each $\lambda \in \sigma(a)$, and hence p = q as functions
on $\sigma(a)$. Hence $f \mapsto f(a)$ is, at least, a bijection of sets. Moreover, the properties
(A.53) - (A.55) turn it into an isomorphism of C*-algebras, evidently with the properties
stated after (A.49). Finally, for any given function $f: \sigma(a) \to \mathbb{C}$ there exists
some polynomial p that coincides with f on the finite set $\sigma(a) \subset \mathbb{R}$, so that we may
define f(a) in (A.56) by p(a), as in (A.52); by the above proof of injectivity, the
ensuing operator f(a) is independent of the choice of p.

We prove the last two claims, using the orthogonality property $e_\lambda e_\mu = \delta_{\lambda\mu} e_\lambda$ of
spectral projections and the defining properties $e_\lambda^2 = e_\lambda^* = e_\lambda$ of general projections,
see (A.26). From eq. (A.37) in Theorem A.10 we obtain (for polynomials f):


$f(a) = \sum_{\lambda \in \sigma(a)} f(\lambda) \cdot e_\lambda$. (A.57)


If we now define $C^*(a)'$ as the linear span of the spectral projections $e_\lambda$ and $1_H$
(which is a unital commutative C*-algebra by the properties of the $e_\lambda$ just mentioned),
then (A.57) shows that $C^*(a) \subseteq C^*(a)'$. Conversely, (A.57) gives (A.51),
which shows that $C^*(a)' \subseteq C^*(a)$, and hence $C^*(a) = C^*(a)'$.
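
The functional calculus of Theorem A.15 can be tried out on a small matrix. The sketch below is illustrative and rests on explicit assumptions: the hermitian matrix is chosen for this example, its spectrum {1, 4} is supplied by hand (trace 5, determinant 4), and $\delta_\lambda(a)$ is computed through the Lagrange polynomial $\prod_{\mu \neq \lambda} (x - \mu)/(\lambda - \mu)$, which coincides with $\delta_\lambda$ on the finite set $\sigma(a)$. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Matrix product a b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    """Unit matrix of size n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def delta_of_a(a, lam, spectrum):
    """delta_lam(a): the Lagrange polynomial for lam on `spectrum`, evaluated at a."""
    n = len(a)
    result = identity(n)
    for mu in spectrum:
        if mu != lam:
            factor = [[(a[i][j] - (mu if i == j else 0)) / (lam - mu)
                       for j in range(n)] for i in range(n)]
            result = matmul(result, factor)
    return result

def function_of_a(f, a, spectrum):
    """f(a) = sum over lam of f(lam) * e_lam, cf. (A.57) with e_lam = delta_lam(a)."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for lam in spectrum:
        e = delta_of_a(a, lam, spectrum)
        result = [[result[i][j] + f(lam) * e[i][j] for j in range(n)] for i in range(n)]
    return result

a = [[2, 1 - 1j], [1 + 1j, 3]]
spectrum = [1, 4]   # assumed known: trace 5 and determinant 4 give eigenvalues 1 and 4

e1 = delta_of_a(a, 1, spectrum)
e4 = delta_of_a(a, 4, spectrum)
cube = function_of_a(lambda x: x ** 3, a, spectrum)   # should agree with a a a
root = function_of_a(math.sqrt, a, spectrum)          # a square root of a
```

The last line shows why the isomorphism is useful beyond polynomials: any function on the finite set $\sigma(a)$, here the square root, yields an operator f(a).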


A second approach to the final claims of Theorem A.15 is more ambitious, as it
includes a derivation of Theorem A.10 (instead of assuming it, as we just did). We
now use (A.51) to define the spectral projections $e_\lambda$; from (A.54) - (A.55) we have


$e_\lambda^2 = \delta_\lambda(a)^2 = \delta_\lambda^2(a) = \delta_\lambda(a) = e_\lambda$;
$e_\lambda^* = \delta_\lambda(a)^* = \overline{\delta_\lambda}(a) = \delta_\lambda(a) = e_\lambda$,


showing that $e_\lambda$ is indeed a projection. Also note the following identities in $C(\sigma(a))$:

$\mathrm{id}_{\sigma(a)} = \sum_{\lambda \in \sigma(a)} \lambda \cdot \delta_\lambda$; (A.58)
$1_{\sigma(a)} = \sum_{\lambda \in \sigma(a)} \delta_\lambda$. (A.59)
A€o(a) 


Transferring these from $C(\sigma(a))$ to $C^*(a)$ via the isomorphism (A.49) then yields
(A.37) - (A.38). To analyse the projections $e_\lambda$ defined by (A.51), we first compute


$e_\lambda e_\mu = \delta_\lambda(a)\delta_\mu(a) = (\delta_\lambda\delta_\mu)(a) = \delta_{\lambda\mu}\delta_\lambda(a) = \delta_{\lambda\mu}e_\lambda$, (A.60)


which shows that the $e_\lambda$ are mutually orthogonal. Second, we compute


$ae_\lambda\psi = a\delta_\lambda(a)\psi = \mathrm{id}_{\sigma(a)}(a)\delta_\lambda(a)\psi = (\mathrm{id}_{\sigma(a)}\delta_\lambda)(a)\psi = \lambda \cdot \delta_\lambda(a)\psi = \lambda e_\lambda\psi$,


which shows that $e_\lambda H \subseteq H_\lambda$. Third, (A.60) and (A.59) give $\bigoplus_{\lambda \in \sigma(a)} e_\lambda H = H$, which
together with the second step gives $e_\lambda H = H_\lambda$. Hence the $e_\lambda$ are indeed the spectral
projections of a. Since we have already proved (A.37) - (A.38), we conclude that
Theorem A.10 follows from the first part of Theorem A.15. By the argument in the
main proof above, this first part then also yields the second part.

The generalization of Theorem A.15 to a family $\mathbf{a} = (a_1,\ldots,a_n)$ of commuting
self-adjoint operators is as follows.


Definition A.16. Let $\mathbf{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators.


1. A joint eigenvector of $\mathbf{a}$ is a nonzero vector $\psi \in H$ such that $\mathbf{a}\psi = \boldsymbol{\lambda}\psi$, where
$\boldsymbol{\lambda} = (\lambda_1,\ldots,\lambda_n)$ with $\lambda_i \in \mathbb{C}$, i.e., for each $i = 1,\ldots,n$, one has $a_i\psi = \lambda_i\psi$. We
call $\boldsymbol{\lambda}$ a joint eigenvalue of $\mathbf{a}$.

2. The joint spectrum $\sigma(a_1,\ldots,a_n) = \sigma(\mathbf{a})$ consists of all joint eigenvalues of $\mathbf{a}$.

3. $C^*(\mathbf{a})$ is the smallest unital C*-subalgebra of B(H) that contains each $a_i$.


Clearly, we have

$\sigma(\mathbf{a}) \subseteq \sigma(a_1) \times \cdots \times \sigma(a_n) \subset \mathbb{R}^n$. (A.61)

Furthermore, since $\dim(H) < \infty$, once again $C^*(\mathbf{a})$ is just the algebra of complex
polynomials in all operators $a_i$.


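Theorem A.17. Let $\underline{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators on $H$.
Then the C*-algebra $C^*(\underline{a})$ generated by these operators is commutative, and:


1. There is a unique isomorphism of C*-algebras

$$C(\sigma(\underline{a})) \cong C^*(\underline{a}), \tag{A.62}$$

written $f \mapsto f(\underline{a})$, subject to the following conditions:

• the unit function $1_{\sigma(\underline{a})}: \underline{\lambda} \mapsto 1$ corresponds to the unit operator $1_H$;
• the coordinate function $\underline{\lambda} \mapsto \lambda_i$ is mapped to $a_i$, for each $i = 1,\ldots,n$.

2. In terms of the spectral projections $e^{(i)}_{\lambda_i}$ of the operators $a_i$, we have

$$C^*(\underline{a}) = C^*(e^{(i)}_{\lambda_i},\ i = 1,\ldots,n,\ \lambda_i \in \sigma(a_i)). \tag{A.63}$$

3. If for each $\underline{\lambda} \in \sigma(\underline{a})$ we define the operator

$$e_{\underline{\lambda}} = e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n}, \tag{A.64}$$

then $e_{\underline{\lambda}}$ is a projection, in terms of which the joint spectrum may be rewritten as

$$\sigma(\underline{a}) = \{\underline{\lambda} \in \sigma(a_1) \times \cdots \times \sigma(a_n) \mid e_{\underline{\lambda}} \neq 0\}. \tag{A.65}$$

4. Finally, we have

$$C^*(\underline{a}) = C^*(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})) = \mathrm{span}(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})). \tag{A.66}$$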


We will not prove this in any detail, as the reasoning is quite analogous to the proof
of Theorem A.15; for example, in (A.56) one just has to replace $a$ by $\underline{a}$. The only
nontrivial point is that since all $a_i$ commute, so do all their spectral projections $e^{(i)}_{\lambda_i}$;
this follows from (A.51), which makes these operators elements of the commutative
C*-algebra $C^*(\underline{a})$ (which by definition contains each $C^*(a_i)$ and, in fact, is just
the smallest C*-algebra in $B(H)$ with this property). Using (A.38) for each $a_i$ and
multiplying the $n$ versions of the unit $1_H$ thus obtained with each other, yields

$$H = \bigoplus_{\underline{\lambda} \in \sigma(\underline{a})} H_{\underline{\lambda}}. \tag{A.67}$$

Since $p(\underline{a})\upsilon_{\underline{\lambda}} = p(\underline{\lambda})\upsilon_{\underline{\lambda}}$ for each joint eigenvector $\upsilon_{\underline{\lambda}} \in H_{\underline{\lambda}}$, eq. (A.67) gives injectivity
of the map (A.56) (mutatis mutandis) by the same argument as for $n = 1$.

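This leads to a multi-spectral theorem for the commuting family $\underline{a}$, which is most
conveniently stated in the following form. First, for any polynomial

$$p(x_1,\ldots,x_n) = \sum_{k_1,\ldots,k_n} c_{k_1 \cdots k_n}\, x_1^{k_1} \cdots x_n^{k_n} \tag{A.68}$$

in $n$ real variables, we generalize (A.52) to

$$p(\underline{a}) = \sum_{k_1,\ldots,k_n} c_{k_1 \cdots k_n}\, a_1^{k_1} \cdots a_n^{k_n}. \tag{A.69}$$


Theorem A.18. Let $\underline{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators on $H$.
Then for any polynomial $p$ in $n$ real variables, with associated operator (A.69),

$$p(\underline{a}) = \sum_{\underline{\lambda} \in \sigma(\underline{a})} p(\underline{\lambda}) \cdot e_{\underline{\lambda}}, \tag{A.70}$$

where the spectral projections $e_{\underline{\lambda}}$ are given by (A.64).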


The special case $p(x_1,\ldots,x_n) = 1$ then recovers (A.67). As for $n = 1$, eq. (A.70) may be
generalized to arbitrary continuous functions $f(x_1,\ldots,x_n)$, either by replacing $f$ by
a polynomial that coincides with $f$ on the joint spectrum $\sigma(\underline{a})$, or by approximating
$f$ by polynomials on some compact set $K$ containing $\sigma(\underline{a})$.
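
To make (A.64), (A.65) and (A.70) tangible, here is a small sketch for a hypothetical commuting pair on $\mathbb{C}^3$. We assume, purely for illustration, that both operators are already diagonal, so each is stored as the tuple of its diagonal entries and operator products become entrywise products; the data and the function names are our own.

```python
from itertools import product

# Two commuting self-adjoint operators on C^3, chosen diagonal (hypothetical data);
# a diagonal matrix is stored as the tuple of its diagonal entries.
a1 = (1, 1, 2)
a2 = (0, 5, 5)
family = [a1, a2]

def spectral_projection(a, lam):
    """Diagonal of the projection onto the eigenspace of `a` for eigenvalue `lam`."""
    return tuple(1 if entry == lam else 0 for entry in a)

def joint_projection(lams):
    """e_lambda = e^(1)_{lambda_1} ... e^(n)_{lambda_n}, cf. (A.64)."""
    result = (1,) * len(family[0])
    for a, lam in zip(family, lams):
        e = spectral_projection(a, lam)
        result = tuple(r * x for r, x in zip(result, e))
    return result

# Joint spectrum via (A.65): keep the tuples whose joint projection is nonzero.
spectra = [sorted(set(a)) for a in family]
joint_spectrum = [lams for lams in product(*spectra) if any(joint_projection(lams))]

def p_of_family(p):
    """Right-hand side of (A.70): sum over the joint spectrum of p(lambda) e_lambda."""
    total = [0] * len(family[0])
    for lams in joint_spectrum:
        e = joint_projection(lams)
        total = [t + p(*lams) * x for t, x in zip(total, e)]
    return tuple(total)

print(joint_spectrum)                          # (2, 0) is absent: its projection vanishes
print(p_of_family(lambda x, y: x * y + 1))     # agrees with a1*a2 + 1 computed entrywise
```

In this example the joint spectrum is a proper subset of $\sigma(a_1) \times \sigma(a_2)$, which is exactly the situation that the inclusion (A.61) allows for.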

Proposition A.19. Let $\underline{a} = (a_1,\ldots,a_n)$ be a family of commuting self-adjoint operators
on $H$. Then there is a self-adjoint operator $a \in B(H)$ such that $C^*(\underline{a}) = C^*(a)$.


Proof. Take $a = \sum_{\underline{\lambda} \in \sigma(\underline{a})} c_{\underline{\lambda}}\, e_{\underline{\lambda}}$, with all $c_{\underline{\lambda}} \in \mathbb{R}$ different from each other. Then

$$C^*(a) = C^*(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})), \tag{A.71}$$

by (A.50), and hence the claim follows from (A.66).


Corollary A.20. Every (unital) commutative C*-algebra $C$ in $B(H)$ is generated by
a single self-adjoint operator $a$ (and the unit $1_H$), i.e., $C = C^*(a)$.


Proof. Just take a basis $(c_k)$ of $C$ as a vector space and decompose $c_k = a_k + i a_k'$,
with $a_k$ and $a_k'$ self-adjoint (namely, $a_k = \frac{1}{2}(c_k + c_k^*)$ and $a_k' = -\frac{1}{2} i (c_k - c_k^*)$). If $C$ is
to be commutative, each $c_k$ must be normal, i.e., $c_k^* c_k = c_k c_k^*$, which is equivalent
to commutativity of $a_k$ and $a_k'$, and all $c_k$ must commute, i.e., all $a_k$ and $a_l'$ must
commute also for $k \neq l$. Hence $C = C^*(a_k, a_k')$, which is of the form $C^*(\underline{a})$ for an
appropriate family $\underline{a}$, and so by Proposition A.19 it takes the form $C^*(a)$.


We say that a unital commutative C*-algebra $C \subset B(H)$ is maximal if it is not
contained in some bigger unital commutative C*-algebra in $B(H)$. Also, we call a
self-adjoint operator $a$ maximal iff $\sigma(a)$ has cardinality $\dim(H)$, or, in other words,
if each eigenvalue of $a$ is nondegenerate. In finite dimension it is easy to classify
maximal unital commutative C*-algebras in $B(H)$ up to unitary equivalence.

Here we say (as usual) that a linear map $u: H \to H'$ is unitary when it is invertible
and satisfies $\langle u\varphi, u\psi\rangle' = \langle\varphi, \psi\rangle$ for each $\varphi, \psi \in H$ (note that the inverse $u^{-1}$ is
automatically linear). Two *-algebras $C \subset B(H)$ and $C' \subset B(H')$ are called unitarily
equivalent, then, if there is a unitary map $u: H \to H'$ such that $C' = uCu^{-1}$.


Theorem A.21. A unital commutative C*-algebra $C \subset B(H)$ is maximal iff it is unitarily
equivalent to the algebra $D_n(\mathbb{C})$ of all diagonal matrices on $H' = \mathbb{C}^n$.


Proof. First, $D_n(\mathbb{C})$ is indeed maximal abelian in $M_n(\mathbb{C})$; any extension of $D_n(\mathbb{C})$
would have to contain some additional matrix $b \in M_n(\mathbb{C})$ that commutes with all
$a \in D_n(\mathbb{C})$, but by elementary linear algebra this very property implies $b \in D_n(\mathbb{C})$.
By Corollary A.20, we have $C = C^*(a)$, where $a^* = a$. Then $C$ is maximal iff $a$
is maximal. For if not, some eigenvalue $\lambda' \in \sigma(a)$ would have multiplicity $m_{\lambda'} > 1$,
and hence the corresponding spectral projection $e_{\lambda'}$ could be decomposed as
$e_{\lambda'} = e^{(1)}_{\lambda'} + e^{(2)}_{\lambda'}$, where both terms are orthogonal and hence commute. We could then
extend $C^*(a)$, as in (A.50), to $C^*(e_\lambda, e^{(1)}_{\lambda'}, e^{(2)}_{\lambda'},\ \lambda \in \sigma(a),\ \lambda \neq \lambda')$, which remains
commutative, and we have a contradiction with the alleged maximality of $C^*(a)$.
Thus $a$ is maximal, in which case we list the spectrum as $\sigma(a) = \{\lambda_1,\ldots,\lambda_n\}$,
with corresponding eigenvectors $\{\upsilon_{\lambda_1},\ldots,\upsilon_{\lambda_n}\}$. This gives rise to a unitary map
$u: H \to \mathbb{C}^n$ defined by $u\upsilon_{\lambda_i} = \upsilon_i$, where $(\upsilon_1,\ldots,\upsilon_n)$ is the standard basis of $\mathbb{C}^n$, and
clearly $uau^{-1} = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$. If (as is the case) all entries $\lambda_i \in \mathbb{R}$ are different,
any $(z_1,\ldots,z_n) \in \mathbb{C}^n$ may be written as $z_i = p(\lambda_i)$, $i = 1,\ldots,n$, where $p$ is some
complex polynomial $p(x) = \sum_k c_k x^k$, $x \in \mathbb{R}$, $c_k \in \mathbb{C}$. Hence $uC^*(a)u^{-1} = D_n(\mathbb{C})$.


A.5 Positive operators and the trace


Operators $a: H \to H$ satisfying one (and hence all) of the conditions in the next
proposition are called positive, written $a \geq 0$ or $0 \leq a$. More generally, we write $a \leq b$
iff $b - a \geq 0$. Positive operators play a very important role in quantum mechanics.


Proposition A.22. The following conditions on an operator $a$ are equivalent:


1. $\langle\psi, a\psi\rangle \geq 0$ for arbitrary $\psi \in H$.

2. $a^* = a$ and $\sigma(a) \subset \mathbb{R}^+$.

3. $a = c^2$ for some self-adjoint operator $c$.

4. $a = b^*b$ for some operator $b$.


Proof. 1 → 2: Putting $\langle\psi, a\psi\rangle \geq 0$ in (A.44) gives $\langle\psi, a\psi\rangle = \langle\psi, a^*\psi\rangle$ for all $\psi$.
But for any operator $b$ and vectors $\chi, \varphi \in H$, as in (A.10) we have the identity

$$4\langle\chi, b\varphi\rangle = \langle\chi + \varphi, b(\chi + \varphi)\rangle - \langle\chi - \varphi, b(\chi - \varphi)\rangle + i\langle\chi - i\varphi, b(\chi - i\varphi)\rangle - i\langle\chi + i\varphi, b(\chi + i\varphi)\rangle. \tag{A.72}$$

So $b = 0$ iff $\langle\psi, b\psi\rangle = 0$ for all $\psi \in H$, and hence condition 1 implies $a^* = a$. We
therefore know that $\sigma(a) \subset \mathbb{R}$, and since an eigenvalue $\lambda < 0$ would contradict
condition 1, the second condition follows.

2 → 3: define $c = \sqrt{a}$, where (since $\lambda_i \geq 0$) the square root is (well) defined by

$$\sqrt{a} = \sum_{i=1}^{\dim(H)} \sqrt{\lambda_i}\, |\upsilon_i\rangle\langle\upsilon_i|. \tag{A.73}$$

3 → 4 is trivial (take $b = c$), as is 4 → 1, since $\langle\psi, a\psi\rangle = \|b\psi\|^2$.
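
The step 4 → 1 can be checked numerically. The sketch below uses a hypothetical $2 \times 2$ complex matrix `b` and a few sample vectors of our own choosing; it forms $a = b^*b$ and compares $\langle\psi, a\psi\rangle$ with $\|b\psi\|^2$, using an inner product that is linear in the second argument, as in the text.

```python
def adjoint(m):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inner(v, w):
    """Inner product, linear in the second argument."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

b = [[1 + 2j, 0.5j], [3 + 0j, -1 + 1j]]  # arbitrary (hypothetical) operator
a = matmul(adjoint(b), b)                 # a = b*b, condition 4

def expectation_and_norm(psi):
    """Return (<psi, a psi>, ||b psi||^2); these agree by the step 4 -> 1."""
    b_psi = apply(b, psi)
    return inner(psi, apply(a, psi)), inner(b_psi, b_psi).real

samples = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 1j, 2 - 1j], [0.3 - 0.7j, -1.1j]]
for psi in samples:
    value, norm_sq = expectation_and_norm(psi)
    print(abs(value - norm_sq) < 1e-12, value.real >= 0)
```

A finite sample of vectors is of course no proof of positivity; the point is only to see the identity $\langle\psi, b^*b\psi\rangle = \|b\psi\|^2$ at work.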


Combining this with Proposition A.5, we obtain the following result.


Proposition A.23. The relationship $\langle\varphi, \psi\rangle' = \langle\varphi, a\psi\rangle$ gives a bijective correspondence
between (hermitian/positive) sesquilinear forms $\langle\cdot,\cdot\rangle'$ on $H$ and (hermitian/positive)
operators $a$ on $H$.


Proof. One direction is trivial. For the other, fix $\chi \in H$ and define a functional
$f(\psi) = \langle\chi, \psi\rangle'$. By Proposition A.5, $f = f_\varphi$ for some unique $\varphi \in H$. Define an
operator $b: H \to H$ by $b\chi = \varphi$ and put $a = b^*$.


Proposition A.24. Any self-adjoint operator $a \in B(H)$ has a decomposition

$$a = a_+ - a_-, \tag{A.74}$$

where $a_\pm \geq 0$. These are unique if they also satisfy $a_+ a_- = a_- a_+ = 0$.


Proof. Using Theorem A.10, we may take

$$a_\pm = \sum_{\lambda \in \sigma(a) \cap \mathbb{R}^\pm} |\lambda|\, e_\lambda. \tag{A.75}$$

Equivalently, we may use Theorem A.15 to rewrite (A.75) as

$$a_\pm = (|\mathrm{id}_{\sigma(a)}| \cdot 1_{\mathbb{R}^\pm})(a) \equiv f_\pm(a), \tag{A.76}$$

where $|\mathrm{id}_{\sigma(a)}|$ is the function $\lambda \mapsto |\lambda|$, $\mathbb{R}^+ = [0,\infty)$ and $\mathbb{R}^- = (-\infty,0)$. To prove
uniqueness, we note that since $\sigma(a) \subset \mathbb{R}$ is finite, there is a polynomial $p$ such that
$f_+ = p$ on $\sigma(a) \cup \{0\}$ (so that $p(0) = 0$), and hence $a_+ = p(a)$. If $a = a_+' - a_-'$ with
$a_\pm' \geq 0$ and $a_+' a_-' = a_-' a_+' = 0$, then for any polynomial $p$ with $p(0) = 0$ we have
$p(a) = p(a_+') + p(-a_-')$. For the one just taken, this gives $p(a) = a_+'$ by positivity
of the $a_\pm'$, and hence $a_+' = a_+$, etc.


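We now introduce a construction of great significance to quantum mechanics.


Lemma A.25. If $(\upsilon_i)$ and $(\upsilon_i')$ are bases of $H$, then for any operator $a: H \to H$,

$$\sum_i \langle\upsilon_i', a\upsilon_i'\rangle = \sum_i \langle\upsilon_i, a\upsilon_i\rangle.$$


Proof. A simple computational proof uses the identity (A.40) for any basis $(\upsilon_i)$ (i.e.,
the $\upsilon_i$ need not be eigenvectors of $a$, as in (A.39)). Then, as in physics books,

$$\sum_i \langle\upsilon_i', a\upsilon_i'\rangle = \sum_{i,j,k} \langle\upsilon_k, \upsilon_i'\rangle\langle\upsilon_i', \upsilon_j\rangle\langle\upsilon_j, a\upsilon_k\rangle = \sum_{j,k} \langle\upsilon_k, \upsilon_j\rangle\langle\upsilon_j, a\upsilon_k\rangle = \sum_j \langle\upsilon_j, a\upsilon_j\rangle.$$


This lemma allows us to define the trace of $a$ by

$$\mathrm{Tr}(a) = \sum_i \langle\upsilon_i, a\upsilon_i\rangle, \tag{A.77}$$

where $(\upsilon_i)$ is any basis of $H$. By almost the same proof as Lemma A.25 we obtain

$$\mathrm{Tr}(ab) = \sum_{i,j} \langle\upsilon_i, a\upsilon_j\rangle\langle\upsilon_j, b\upsilon_i\rangle = \sum_{i,j} \langle\upsilon_j, b\upsilon_i\rangle\langle\upsilon_i, a\upsilon_j\rangle = \mathrm{Tr}(ba). \tag{A.78}$$

If $u$ is unitary (in that $uu^* = u^*u = 1_H$), then from either Lemma A.25 or eq. (A.78),

$$\mathrm{Tr}(uau^*) = \mathrm{Tr}(a). \tag{A.79}$$

Finally, if $a^* = a$, then (A.37) and taking the trace over the basis in (A.39) yields

$$\mathrm{Tr}(a) = \sum_{\lambda \in \sigma(a)} m_\lambda \cdot \lambda. \tag{A.80}$$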
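
Lemma A.25 and eq. (A.78) are easy to test on small matrices. In the following sketch the matrices `a`, `b` and the second orthonormal basis `rotated` are hypothetical choices of ours; the trace is computed literally as $\sum_i \langle\upsilon_i, a\upsilon_i\rangle$ in each basis.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inner(v, w):
    """Inner product, linear in the second argument."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def trace(a, basis):
    """Tr(a) = sum_i <v_i, a v_i> over an orthonormal basis, cf. (A.77)."""
    return sum(inner(v, apply(a, v)) for v in basis)

a = [[1 + 0j, 2 - 1j], [0.5j, 3 + 0j]]
b = [[0j, 1 + 0j], [2 + 1j, -1j]]

s = 1 / math.sqrt(2)
standard = [[1 + 0j, 0j], [0j, 1 + 0j]]
rotated = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]  # another orthonormal basis of C^2

print(trace(a, standard), trace(a, rotated))                          # Lemma A.25
print(trace(matmul(a, b), standard), trace(matmul(b, a), standard))   # (A.78)
```

Up to rounding, both lines print two equal numbers, although `a` is not self-adjoint and `a` and `b` do not commute.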


Definition A.26. A density operator is a positive operator $\rho$ on $H$ such that

$$\mathrm{Tr}(\rho) = 1. \tag{A.81}$$


The analysis of density operators hinges on the introduction of a second operator
norm, beside the canonical one (A.18). In finite dimension these norms are equivalent,
but in general they are not, and it makes sense to introduce both already here.
For any $a \in B(H)$, the operator $a^*a$ is positive and hence self-adjoint, so that

$$a^*a = \sum_{\mu \in \sigma(a^*a)} \mu\, e_\mu = \sum_{i=1}^n \mu_i\, |\upsilon_i\rangle\langle\upsilon_i| \tag{A.82}$$

for certain eigenvalues $\mu_i \geq 0$ (including possible multiplicities) or $\mu \in \sigma(a^*a)$
(excluding multiplicities), all necessarily non-negative by positivity of $a^*a$, and some
normalized eigenvectors $\upsilon_i$ or spectral projections $e_\mu$; cf. (A.37) - (A.39). Then put

$$\|a\|_1 = \sum_{\mu \in \sigma(a^*a)} m_\mu \sqrt{\mu} = \sum_{i=1}^n \sqrt{\mu_i}. \tag{A.83}$$

It is not immediately clear that $\|\cdot\|_1$ is a norm on $B(H)$, but we will shortly prove
that it is; we provisionally refer to $B(H)$, equipped with the norm (A.83), as $B_1(H)$.
Another way to define this trace-norm is to first introduce the absolute value

$$|a| = \sqrt{a^*a} \tag{A.84}$$

of any operator $a \in B(H)$, where the square root is simply defined as

$$\sqrt{a^*a} = \sum_{\mu \in \sigma(a^*a)} \sqrt{\mu}\, e_\mu = \sum_{i=1}^n \sqrt{\mu_i}\, |\upsilon_i\rangle\langle\upsilon_i|, \tag{A.85}$$

which coincides with $f(a^*a)$ for $f(x) = \sqrt{x}$ as defined in Theorem A.15, see (A.57).
If $a$ is positive, then $|a| = a$. Some other useful properties of the absolute value are

$$\ker|a| = \ker a = (\mathrm{ran}\,|a|)^\perp; \tag{A.86}$$
$$\||a|\psi\| = \|a\psi\|, \quad \psi \in H. \tag{A.87}$$

For the first equality in (A.86),

$$a\psi = 0 \Rightarrow a^*a\psi = 0 \Leftrightarrow \sqrt{a^*a}\,\psi = 0 \Leftrightarrow |a|\psi = 0,$$

but also $a^*a\psi = 0 \Rightarrow \langle\psi, a^*a\psi\rangle = 0 \Rightarrow \|a\psi\|^2 = 0 \Rightarrow a\psi = 0$. For the second, we apply to
the self-adjoint operator $|a|$ the general identity

$$\ker a = (\mathrm{ran}\, a^*)^\perp, \tag{A.88}$$

which in turn is immediate from the definition of the adjoint. Eq. (A.87) is similar.
Though once again lacking transparency as a norm, by construction we now have

$$\|a\|_1 = \mathrm{Tr}(|a|), \tag{A.89}$$

so if $(\lambda_i)$ are the (positive) eigenvalues of $|a|$, including multiplicities, then

$$\|a\|_1 = \sum_{i=1}^n \lambda_i. \tag{A.90}$$

To obtain suitable estimates for the trace norm we need some further techniques.
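
For $2 \times 2$ matrices the definitions (A.82) - (A.83) can be evaluated in closed form, because the two eigenvalues of the hermitian matrix $a^*a$ follow from its trace and its off-diagonal entry. The sketch below does this; the function names and the sample matrices are hypothetical, and we use the standard fact that the operator norm (A.18) equals the square root of the largest eigenvalue of $a^*a$.

```python
import math

def gram(a):
    """Return a*a for a 2x2 matrix a."""
    return [[sum(a[k][i].conjugate() * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def singular_values(a):
    """Square roots of the eigenvalues mu_i of a*a, cf. (A.82)."""
    m = gram(a)
    p, r, q = m[0][0].real, m[1][1].real, m[0][1]
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
    # max(..., 0.0) guards against tiny negative values caused by rounding.
    return [math.sqrt(max(mean + radius, 0.0)), math.sqrt(max(mean - radius, 0.0))]

def trace_norm(a):
    """||a||_1 as in (A.83): the sum of the square roots of the mu_i."""
    return sum(singular_values(a))

def operator_norm(a):
    """The usual operator norm: the largest of the square roots of the mu_i."""
    return max(singular_values(a))

diagonal = [[3, 0], [0, -4]]
nilpotent = [[0, 1], [0, 0]]     # all eigenvalues vanish, yet the trace norm does not
combined = [[3, 1], [0, -4]]     # diagonal + nilpotent

print(trace_norm(diagonal), operator_norm(diagonal))
print(trace_norm(nilpotent), operator_norm(nilpotent))
print(trace_norm(combined) <= trace_norm(diagonal) + trace_norm(nilpotent))
```

The nilpotent example shows why the eigenvalues of $|a|$, rather than those of $a$ itself, enter the definition.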

Definition A.27. Let $H$ be a finite-dimensional Hilbert space.


1. A partial isometry is an operator $u \in B(H)$ for which $u^*u = e$ is a projection.
2. A unitary is an invertible partial isometry.


For immediate and later reference, we collect some properties of such operators.

Lemma A.28. Let $H$ be a Hilbert space with a partial isometry $u \in B(H)$.


• Also $u^*$ is a partial isometry, or, equivalently, $uu^* = f$ is a projection.

• The kernel of $u$ is $(eH)^\perp$, and its range is $fH$.

• The given partial isometry $u$ is unitary from $eH$ to $fH$.

• Conversely, an operator $v$ on $H$ for which there is a (closed) subspace $L \subset H$ on
which $v$ is isometric, whilst it is identically zero on $L^\perp$, is a partial isometry.

• If $u \neq 0$, then $\|u\| = 1$.

• A partial isometry $u$ is unitary iff $u^*u = uu^* = 1_H$ (i.e. $e = f = 1_H$).


The proof is an easy verification. In the infinite-dimensional case, a distinction arises
between isometries (i.e., injective partial isometries, so that $u^*u = 1_H$) and unitaries,
but if $\dim(H) < \infty$, injectivity implies surjectivity and hence bijectivity.

We now come to von Neumann’s highly convenient polar decomposition of an
operator, which mimics the polar decomposition $z = r\exp(i\varphi)$ of $z \in \mathbb{C}$.


Proposition A.29. For $a \in B(H)$, assumed nonzero, the operator $u$ given by

$$u|a|\psi = a\psi \quad (|a|\psi \in \mathrm{ran}\,|a|); \tag{A.91}$$
$$u\psi = 0 \quad (\psi \in (\mathrm{ran}\,|a|)^\perp = \ker|a|): \tag{A.92}$$

1. Is well defined;

2. Is a partial isometry (and hence has norm $\|u\| = 1$);

3. Is unitary from $\mathrm{ran}\,|a|$ to $\mathrm{ran}\, a$ (if $\dim(H) = \infty$, take closures $(\mathrm{ran}\,|a|)^-$, $(\mathrm{ran}\, a)^-$);

4. Satisfies

$$\||a|\psi\| = \|a\psi\|; \tag{A.93}$$
$$u^*u|a| = |a| = |a|u^*u. \tag{A.94}$$


Given that $u$ is a partial isometry, it is characterized by the two properties:

$$\ker u = \ker a; \tag{A.95}$$
$$a = u|a|. \tag{A.96}$$

Furthermore, if $a \neq 0$, then $a$ is invertible iff $u$ is unitary.


Proof. This follows from (A.86) - (A.87), except the claim that (A.95) - (A.96)
uniquely define $u$, which we will not use and whose proof we therefore omit.
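
For an invertible $2 \times 2$ matrix the polar decomposition $a = u|a|$ can be computed by hand. The sketch below is restricted to that case and relies on an assumption that is not in the text: for a positive definite $2 \times 2$ matrix $m$, the Cayley-Hamilton theorem gives $\sqrt{m} = (m + \sqrt{\det m}\cdot 1)/\tau$ with $\tau = \sqrt{\mathrm{Tr}(m) + 2\sqrt{\det m}}$. The sample matrix and all function names are hypothetical.

```python
import math

def adjoint(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    d = det(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def absolute_value(a):
    """|a| = sqrt(a*a) for an invertible 2x2 matrix (Cayley-Hamilton formula)."""
    m = matmul(adjoint(a), a)
    root_det = math.sqrt(det(m).real)                          # det(a*a) = |det a|^2 > 0
    tau = math.sqrt((m[0][0] + m[1][1]).real + 2 * root_det)   # trace of |a|
    return [[(m[i][j] + (root_det if i == j else 0)) / tau for j in range(2)]
            for i in range(2)]

def polar(a):
    """Return (u, |a|) with a = u|a|; for invertible a we may set u = a |a|^{-1}."""
    modulus = absolute_value(a)
    return matmul(a, inverse(modulus)), modulus

def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]
a = [[1 + 1j, 2 + 0j], [0.5j, -1 + 0j]]   # invertible: det a = -1 - 2i
u, modulus = polar(a)

print(max_abs_diff(matmul(adjoint(u), u), identity))          # u*u = 1, so u is unitary
print(max_abs_diff(matmul(u, modulus), a))                    # a = u|a|, cf. (A.96)
print(max_abs_diff(matmul(modulus, modulus), matmul(adjoint(a), a)))
```

All three printed deviations are at the level of rounding error, in line with the final claim of the proposition that $u$ is unitary when $a$ is invertible. For non-invertible $a$ the formula $u = a|a|^{-1}$ is unavailable and one has to use (A.91) - (A.92) directly.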


Recall from the easy Theorem 2.7 that there is a bijective correspondence between
linear maps $\omega: B(H) \to \mathbb{C}$ and operators $\rho \in B_1(H)$, given by (2.33), i.e.,

$$\omega(a) = \mathrm{Tr}(\rho a). \tag{A.97}$$


Proposition A.30. If $H$ is finite-dimensional, the map $\omega \mapsto \rho$ from $B(H)^*$ to $B_1(H)$,
defined by (A.97), gives an isometric isomorphism of Banach spaces

$$B(H)^* \cong B_1(H); \tag{A.98}$$

in particular, one has

$$\|\omega\| = \|\rho\|_1. \tag{A.99}$$

Proof. Bijectivity being known already, the basic estimate towards (A.99) is

$$|\mathrm{Tr}(\rho a)| \leq \|\rho\|_1 \|a\|. \tag{A.100}$$

This follows from the polar decomposition $\rho = u|\rho|$ and the spectral decomposition

$$|\rho| = \sum_{i=1}^{m \leq n} p_i\, |\upsilon_i\rangle\langle\upsilon_i|, \tag{A.101}$$

where $p_i > 0$ (but not necessarily $\sum_i p_i = 1$). Assuming $\rho \neq 0$, using (A.101), (A.78),
Cauchy–Schwarz, (A.20), (A.21), $\|u\| = \|\upsilon_i\| = 1$, and (A.90), we indeed have

$$|\mathrm{Tr}(\rho a)| = |\mathrm{Tr}(u|\rho|a)| = |\mathrm{Tr}(|\rho|au)| = \Big|\sum_i p_i \langle\upsilon_i, au\upsilon_i\rangle\Big| \tag{A.102}$$
$$\leq \sum_i p_i\, |\langle\upsilon_i, au\upsilon_i\rangle| \leq \sum_i p_i\, \|a\| \|u\| \|\upsilon_i\|^2 = \|\rho\|_1 \|a\|. \tag{A.103}$$

To prove saturation of this bound, take $a = u^*$; since $u$ is isometric on the space
$\mathrm{ran}\,|\rho| = \mathrm{span}(\upsilon_1,\ldots,\upsilon_m)$, this choice satisfies $\|a\| = 1$ as well as $\langle\upsilon_i, au\upsilon_i\rangle = 1$.
Consequently, from (A.102) we find $|\mathrm{Tr}(\rho a)| = \sum_i p_i$. By (A.90) for $\rho$ instead of $a$, i.e.,
$\|\rho\|_1 = \mathrm{Tr}(|\rho|) = \sum_i p_i$, we obtain $|\mathrm{Tr}(\rho a)| = \|\rho\|_1$, which yields (A.99).


Corollary A.31. The trace-norm $\|\cdot\|_1$ is (indeed) a norm on $B_1(H)$.


As explained in more detail in §B.9, for any vector space $V$ with norm, with double
dual $V^{**}$, we have a canonical map $V \to V^{**}$ given by $v \mapsto \hat{v}$, where

$$\hat{v}(\theta) = \theta(v), \tag{A.104}$$

where $v \in V$, $\hat{v} \in V^{**}$, and $\theta \in V^*$. By the general theory, this map is always isometric
(and hence injective), and if $V$ is finite-dimensional, it is also surjective and hence
an isomorphism. Therefore, taking $V = B(H)$, we infer from (A.98) that

$$B_1(H)^* \cong B(H), \tag{A.105}$$

where $a \in B(H)$ corresponds to $\hat{a} \in B_1(H)^*$ by means of

$$\hat{a}(\rho) = \mathrm{Tr}(\rho a). \tag{A.106}$$

This new role of $B(H)$ as the dual of $B_1(H)$ also equips it with a new topology
(besides the norm topology it already has), viz. the accompanying w*-topology.
This topology is defined by saying that $a_\lambda \to a$ iff $\hat{a}_\lambda(\rho) \to \hat{a}(\rho)$ for each
$\rho \in B_1(H)$. For historical reasons this is called the σ-weak topology on $B(H)$, so we say
that $a_\lambda \to a$ σ-weakly in $B(H)$ iff $\mathrm{Tr}(\rho a_\lambda) \to \mathrm{Tr}(\rho a)$ for each $\rho \in B_1(H)$.


To close, it is interesting to put the trace-norm into a classical perspective. As
explained in Chapter 1, at least on finite-dimensional Hilbert spaces, density operators
are the quantum counterparts of probability measures (or distributions). If $X$ is a
finite set, the associated function space $C(X)$ carries the supremum-norm

$$\|f\|_\infty = \sup\{|f(x)|, x \in X\}, \tag{A.107}$$

cf. (1.24). We equip the space $C(X)^*$ of all linear maps $\omega: C(X) \to \mathbb{C}$ with the norm

$$\|\omega\| = \sup\{|\omega(f)|, f \in C(X), \|f\|_\infty = 1\}. \tag{A.108}$$

Let $L^1(X)$ be the vector space of all functions $\rho: X \to \mathbb{C}$, equipped with the norm

$$\|\rho\|_1 = \sum_{x \in X} |\rho(x)|. \tag{A.109}$$

As in the quantum case just discussed, even for finite $X$ it is not immediate that this
expression indeed defines a norm; this follows from the next proposition.
Each $\rho \in L^1(X)$ defines a linear map $\omega: C(X) \to \mathbb{C}$ by

$$\omega(f) = \sum_{x \in X} \rho(x) f(x). \tag{A.110}$$

Conversely, each $\omega \in C(X)^*$ defines an element $\rho \in L^1(X)$ by

$$\rho(x) = \omega(\delta_x), \tag{A.111}$$

with $\delta_x \in C(X)$ defined by $\delta_x(y) = \delta_{xy}$ as usual.


Proposition A.32. If $X$ is finite, the map $\omega \mapsto \rho$ from $C(X)^*$ to $L^1(X)$, defined by
(A.111), has inverse (A.110) and gives an isometric isomorphism

$$C(X)^* \cong L^1(X) \tag{A.112}$$

of Banach spaces; in particular, one has

$$\|\omega\| = \|\rho\|_1. \tag{A.113}$$


Proof. The vector space isomorphism in question can be checked effortlessly. To
verify (A.113), note that trivially $|\omega(f)| \leq \|\rho\|_1 \|f\|_\infty$, whence $\|\omega\| \leq \|\rho\|_1$. To
show saturation of this bound, given $\rho \in L^1(X)$ take $f(x) = |\rho(x)|/\rho(x)$ if $\rho(x) \neq 0$
and $f(x) = 0$ elsewhere; if $\rho \neq 0$ this gives $\|f\|_\infty = 1$ and $|\omega(f)| = \|\rho\|_1$.
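
The saturation argument in this proof can be replayed on a four-point set. In the sketch below the function `rho` is a hypothetical element of $L^1(X)$ given by its list of values; `f_opt` is the function $|\rho|/\rho$ from the proof, and `f_other` is an arbitrary competitor of sup-norm one.

```python
rho = [2 - 1j, 0j, -3 + 0j, 0.5j]    # hypothetical rho on X = {0, 1, 2, 3}

def omega(f):
    """The functional (A.110) attached to rho."""
    return sum(r * fx for r, fx in zip(rho, f))

def sup_norm(f):
    return max(abs(fx) for fx in f)

norm_1 = sum(abs(r) for r in rho)    # (A.109)

# The maximizer from the proof: f = |rho| / rho where rho is nonzero, and 0 elsewhere.
f_opt = [abs(r) / r if r != 0 else 0j for r in rho]
f_other = [1 + 0j, 1j, -1 + 0j, 1 + 0j]

print(sup_norm(f_opt), abs(omega(f_opt)), norm_1)   # sup-norm 1, and the bound is attained
print(abs(omega(f_other)) <= norm_1)                # any other unit-norm f stays below it
```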

Notes


The material in this appendix has been collected from numerous functional analysis 
books (some of which are mentioned in the Notes to the next appendix), adapted to 
the finite-dimensional case. Though not used in preparing this text, Halmos (1958, 
1970) are classics. Theorem A.3 is due to Jordan & von Neumann (1935); Amir 
(1986) contains many other characterizations of inner product spaces. 


Appendix B 
Basic functional analysis 


This appendix contains all technical information on general Hilbert spaces (as op- 
posed to the finite-dimensional ones of the previous appendix) and, more generally, 
infinite-dimensional Banach spaces, that is either directly needed in the main text, 
or forms necessary preparation for the next appendix on operator algebras (which in 
turn play a central role in this book). Since most interesting examples of both Hilbert 
spaces and more general Banach spaces require some measure theory, which at the 
same time provides the mathematical foundation of probability theory, we include 
a brief introductory overview to this area as well (restricted, though, to the case we 
need, viz. measures and integrals on locally compact spaces). 

Functional analysis has its roots in both mathematics and physics. In particular, 
the general area of spectral theory, which emerged during the period 1900-1930 
in the hands of Hilbert and his school, largely owes its existence to mathematical 
physics, as well as to Hilbert’s genius in finding the right combination of examples 
and abstract theory (including his innovative definition of the spectrum). Hilbert’s 
school culminated in the books Methoden der mathematischen Physik by Courant 
and Hilbert (1924), Gruppentheorie und Quantenmechanik by Weyl (1928), and 
Mathematische Grundlagen der Quantenmechanik by von Neumann (1932), all of 
whom were at Göttingen at the time (as were such giants in the history of quantum
mechanics as Born, Heisenberg, and Jordan). Whereas Courant & Hilbert
at least thought they described classical physics (although it soon turned out that
their discussion of eigenvalue problems paved the way for the Schrödinger equation
discovered two years later), von Neumann explicitly developed the Hilbert space 
formalism in order to describe quantum physics (for example, the modern abstract 
definition of a Hilbert space was his), as did Weyl (in connection with group theory). 

What seems to have come from pure mathematics, though, is the idea, central to 
functional analysis, of looking at functions as points in some (infinite-dimensional) 
vector space. This emerged from the French school of Hadamard and his student 
Fréchet, requiring considerable interaction between the (then) new fields of linear 
algebra and topology. Eventually, this also led to the fundamental work of Banach. 

We hope that the combination of logical setup, examples, theorems, and proofs in 
this appendix helps convince the reader of the sober elegance of functional analysis. 


© The Author(s) 2017
K. Landsman, Foundations of Quantum Theory,
Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3

B.1 Completeness


A notable difference between finite-dimensional vector spaces with norm and 
infinite-dimensional ones is that the former are always complete in a sense to be 
defined now, whereas the latter may or may not be. This distinction has major con- 
sequences, especially where idealizations (and hence limits) are concerned. 

As before, all vector spaces are defined over C (unless stated otherwise). 


Definition B.1. Let $V$ be a vector space (or, more generally, a set).
A metric on $V$ is a function $d: V \times V \to \mathbb{R}^+$ satisfying, for all $f, g, h \in V$:


1. $d(f,g) \leq d(f,h) + d(h,g)$ (triangle inequality);

2. $d(f,g) = d(g,f)$ (symmetry);

3. $d(f,g) = 0$ iff $f = g$ (positive definiteness).


Our main example is a vector space $V$ with norm $\|\cdot\|$, which, as an easy exercise
shows, gives rise to a metric on $V$ via

$$d(f,g) = \|f - g\|. \tag{B.1}$$


In particular, an inner product on V induces a metric on V through (A.2) and (B.1). 
The reader should have some experience with metric spaces from an undergrad- 
uate Analysis course, but for convenience we repeat the definition of completeness. 


Definition B.2. 1. Let $(v_n) = \{v_n\}_{n \in \mathbb{N}}$ be a sequence in a metric space $(V,d)$.
We say that $v_n \to v$ for some $v \in V$ when $\lim_{n \to \infty} d(v_n, v) = 0$, or, more precisely:
for any $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $d(v_n, v) < \varepsilon$ for all $n > N$. In a normed
space, this means that $v_n \to v$ iff $\lim_{n \to \infty} \|v_n - v\| = 0$.

2. A sequence $(v_n)$ in $(V,d)$ is called a Cauchy sequence when $d(v_n, v_m) \to 0$ when
$n, m \to \infty$, or, more precisely: for any $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $d(v_n, v_m) < \varepsilon$
for all $n, m > N$. In a normed space, this means that $(v_n)$ is Cauchy when
$\|v_n - v_m\| \to 0$ for $n, m \to \infty$, in other words, when $\lim_{n,m \to \infty} \|v_n - v_m\| = 0$.

3. A metric space $(V,d)$ is called complete when every Cauchy sequence in $V$ converges
(i.e., to an element of $V$).


A convergent sequence is Cauchy: from the triangle inequality and symmetry one
has $d(v_n, v_m) \leq d(v_n, v) + d(v_m, v)$, so for given $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that
$d(v_n, v) < \varepsilon/2$ for all $n > N$, et cetera. However, the converse statement does not hold
in general: for example, take the vector space $\ell_c(\mathbb{N})$ of all functions $f: \mathbb{N} \to \mathbb{C}$ that are zero
except at finitely many places (with the obvious pointwise operations), or, equivalently,
the vector space $\mathbb{C}^\infty$ of all sequences $(x_n)$ with finitely many nonzero entries.
This vector space is incomplete in any conceivable norm, like the sup-norm

$$\|f\|_\infty = \sup\{|f(x)|, x \in \mathbb{N}\}. \tag{B.2}$$

Indeed, the sequence $(f_n)$, where $f_n(x) = 1/x$ for $x = 1,\ldots,n$ and $f_n(x) = 0$ for $x > n$,
which corresponds to the sequence $(1, 1/2, 1/3, \ldots, 1/n, 0, 0, \ldots)$ in $\mathbb{C}^\infty$, is Cauchy,
but its obvious limit $f(x) = 1/x$ for each $x \in \mathbb{N}$, or $x_n = 1/n$, does not lie in $\ell_c(\mathbb{N})$.
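
The counterexample can be followed with exact rational arithmetic. In this sketch (function names are ours) a finitely supported sequence is stored as the list of its values at $x = 1, \ldots, n$, with zeros understood afterwards. The sup-distance between $f_n$ and $f_m$ for $n < m$ is $1/(n+1)$, which tends to zero, whereas the distance from $f_n$ to any fixed finitely supported candidate limit stays bounded away from zero.

```python
from fractions import Fraction

def f(n):
    """f_n as a finitely supported sequence: its values at x = 1, ..., n."""
    return [Fraction(1, x) for x in range(1, n + 1)]

def sup_distance(g, h):
    """Sup-norm (B.2) of g - h for finitely supported sequences of any lengths."""
    length = max(len(g), len(h))
    pad = lambda s: s + [Fraction(0)] * (length - len(s))
    return max(abs(u - v) for u, v in zip(pad(g), pad(h)))

# Cauchy property: the distance between f_n and f_m (n < m) is 1/(n + 1).
print(sup_distance(f(10), f(1000)))

# A candidate limit supported on {1, ..., 5} stays at distance 1/6 from every f_n, n >= 6.
candidate = f(5)
print([sup_distance(f(n), candidate) for n in (6, 60, 600)])
```

The same obstruction applies to every finitely supported candidate, which is the content of the statement that the limit $1/x$ lies outside $\ell_c(\mathbb{N})$.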

Definition B.3. • A Banach space is a vector space with norm that is complete in
the associated metric (B.1).

• A Hilbert space is a vector space with inner product that is complete in the associated
metric (B.1), in which the norm is defined by (A.2). Equivalently, a Hilbert
space is a Banach space whose norm comes from an inner product via (A.2).


As we have seen, $\ell_c(\mathbb{N})$ fails to be a Banach space in the sup-norm, but (the larger
space) $\ell^\infty(\mathbb{N})$, which consists of all bounded functions $f: \mathbb{N} \to \mathbb{C}$, is (see §B.2).


Definition B.4. Two norms $\|\cdot\|$ and $\|\cdot\|'$ on the same vector space $V$ are equivalent
if there are constants $M > 0$ and $m > 0$ such that for any $v \in V$,

$$m\|v\|' \leq \|v\| \leq M\|v\|'. \tag{B.3}$$

In that case, the two metric topologies on $V$ defined by these norms coincide, so that
in particular completeness and convergence in $\|\cdot\|$ and $\|\cdot\|'$ are the same.


Proposition B.5. Let $V$ be a finite-dimensional vector space. All norms on $V$ are
equivalent, and hence $V$ is complete in any norm.


Proof. We derive this from a basic fact of Analysis, namely that $\mathbb{C}^n$ is complete in
the (Euclidean) norm $\|\cdot\|_2$ derived from the standard inner product (A.11), that is,

$$\|z\|_2^2 = \sum_{i=1}^n |z_i|^2. \tag{B.4}$$

So the first step is to transfer the problem from $V$ to $\mathbb{C}^n$, where $n = \dim(V)$, by
choosing a basis $(v_i)$ of $V$, and mapping $v_i$ to the standard basis vector $u_i$ of $\mathbb{C}^n$.
Linear extension then maps $v = \sum_i z_i v_i \in V$ to $z = (z_1,\ldots,z_n) \in \mathbb{C}^n$, which gives an
isomorphism $V \to \mathbb{C}^n$. This map endows $\mathbb{C}^n$ with a new norm $\|z\| = \|v\|$ (i.e. the
given norm on $V$), which we now prove to be equivalent to $\|\cdot\|_2 = \|\cdot\|'$. The second
inequality in (B.3) easily follows from Cauchy–Schwarz, viz.

$$\|z\| = \Big\|\sum_i z_i u_i\Big\| \leq \sum_i |z_i| \|u_i\| \leq \Big(\sum_i |z_i|^2\Big)^{1/2} \Big(\sum_j \|u_j\|^2\Big)^{1/2} = M\|z\|_2,$$

with $M = (\sum_j \|u_j\|^2)^{1/2}$. This inequality, together with the elementary but extremely useful estimate

$$\big|\, \|v\| - \|w\| \,\big| \leq \|v - w\|, \tag{B.5}$$

which is valid for any norm in any dimension, implies that the function $\|\cdot\|: \mathbb{C}^n \to \mathbb{R}$
is continuous with respect to the Euclidean metric on $\mathbb{C}^n$. Now the unit sphere
$\mathbb{C}^n_1 = \{x \in \mathbb{C}^n \mid \|x\|_2 = 1\}$ in $\mathbb{C}^n$ is compact, so according to Weierstrass, the norm $\|\cdot\|$
assumes a minimum on $\mathbb{C}^n_1$. Hence there exists $z_0 \in \mathbb{C}^n_1$ such that $\|z_0\| \leq \|z\|$ for all
$z \in \mathbb{C}^n_1$. For arbitrary nonzero $z \in \mathbb{C}^n$, the rescaled vector $z' = z/\|z\|_2$ lies in $\mathbb{C}^n_1$, so
$\|z_0\| \leq \|z'\|$, which is nothing but the first inequality in (B.3) with $m = \|z_0\|$.
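
To see the constants $m$ and $M$ of (B.3) in a concrete case, take $\|\cdot\| = \|\cdot\|_1$ (the sum of absolute values) and $\|\cdot\|' = \|\cdot\|_2$ on $\mathbb{C}^n$. This choice of norm is our own illustrative assumption: it gives $M = \sqrt{n}$ by the Cauchy–Schwarz step, since each $\|u_j\|_1 = 1$, and $m = 1$, attained at the standard basis vectors. The sketch samples random vectors and reports the extreme values of the ratio $\|z\|_1/\|z\|_2$.

```python
import math
import random

def norm_1(z):
    return sum(abs(c) for c in z)

def norm_2(z):
    return math.sqrt(sum(abs(c) ** 2 for c in z))

n = 4
M = math.sqrt(n)   # (sum_j ||u_j||_1^2)^(1/2), from the Cauchy-Schwarz step
m = 1.0            # minimum of ||.||_1 on the Euclidean unit sphere

rng = random.Random(0)   # fixed seed, so that the run is reproducible
samples = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
           for _ in range(1000)]
ratios = [norm_1(z) / norm_2(z) for z in samples]

print(m, min(ratios), max(ratios), M)   # every ratio lies between m and M
```

Both bounds are sharp here: a standard basis vector has ratio $1$, and the vector $(1,\ldots,1)$ has ratio $\sqrt{n}$.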

B.2 $\ell^p$ spaces


The simplest examples of infinite-dimensional Banach spaces are the $\ell^p$-spaces,
where $1 \leq p \leq \infty$ (for $p < 1$ the Minkowski inequality (B.14) below goes in the
wrong direction, so that, by failure of the triangle inequality, eq. (B.8) below fails to
define a norm). Such spaces are defined on some set $X$, hence we write $\ell^p(X)$.

If $X = \{x_1,\ldots,x_n\}$ is finite, with cardinality $n = |X|$, then $\ell^p(X)$ consists of all
functions $f: X \to \mathbb{C}$ with pointwise operations, so that $\ell^p(X) \cong \mathbb{C}^n$ as vector spaces
through the map $f \mapsto (f(x_1),\ldots,f(x_n))$, where $\mathbb{C}^n$ is equipped with a specific (and,
for $p \neq 2$, unusual) norm. However, by Proposition B.5 we may as well take $p = 2$
and nothing has been gained compared with the linear algebra of Appendix A.

Therefore, life starts with infinite sets $X$, and we begin with the simplest of those,
viz. $X = \mathbb{N}$ (but to avoid unnecessary duplication with regard to later generalization,
although for the moment we assume $X = \mathbb{N}$, we still write $X$ for the underlying set).
We define $\ell^p = \ell^p(X)$ as the set of functions $f: X \to \mathbb{C}$ that satisfy

$$\sum_{x \in X} |f(x)|^p < \infty \quad (1 \leq p < \infty); \tag{B.6}$$
$$\sup_{x \in X} |f(x)| < \infty \quad (p = \infty). \tag{B.7}$$

As will be shown in far greater generality (cf. Theorem B.9), the point is that for any
$1 \leq p \leq \infty$, the set $\ell^p(X)$ thus defined is not merely a vector space (under pointwise
operations); it is even a Banach space in the norm

$$\|f\|_p = \Big(\sum_{x \in X} |f(x)|^p\Big)^{1/p} \quad (1 \leq p < \infty); \tag{B.8}$$
$$\|f\|_\infty = \sup\{|f(x)|, x \in X\} = \inf\{C \geq 0 \mid |f(x)| \leq C\ \forall x \in X\}. \tag{B.9}$$

The case $p = 2$ is unique in that $\ell^2(X)$ is also a Hilbert space in the inner product

$$\langle f, g\rangle = \sum_{x \in X} \overline{f(x)}\, g(x). \tag{B.10}$$


As we now outline, these expressions may be generalized to any set, to which
end we should define the meaning of (possibly uncountable) sums $\sum_{x \in X}$. Although
the generality below will only be used in §B.12, it is convenient (at little extra cost)
to cover more general codomains for $f$ than just the complex numbers $\mathbb{C}$.


Definition B.6. Let $X$ be a set, $V$ a normed vector space, $f: X \to V$ some function,
and $v \in V$. The sentence $\sum_{x \in X} f(x) = v$ means that for each $\varepsilon > 0$ there is a finite
subset $F \subset X$ such that for each finite subset $G \subset X$ with $F \subset G$, we have

$$\Big\|\sum_{x \in G} f(x) - v\Big\| < \varepsilon.$$

In terms of nets, this means that the net $s = (s_F)_F$ in $V$ indexed by the finite
subsets $F \subset X$ (ordered by inclusion), where $s_F = \sum_{x \in F} f(x)$, converges to $v$.
For $X = \mathbb{N}$ and $V = \mathbb{C}$ we may take $F$ to be $\{1,\ldots,N\}$ and $G$ to be $\{1,\ldots,n\}$,
where $n > N$, in which case we recover the usual notion of convergence of sums (i.e.
$\forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n > N: |\sum_{x=1}^n f(x) - v| < \varepsilon$). However, since also more general $F$
and $G$ are allowed, Definition B.6 is in fact equivalent to absolute convergence:


Lemma B.7. Let $X$ be a set and let $f: X \to \mathbb{C}$ be some function.


1. There exists $z \in \mathbb{C}$ such that $\sum_{x \in X} f(x) = z$ iff $\sum_{x \in X} |f(x)| < \infty$.

2. If $f(x) \geq 0$ for each $x \in X$, then, in the sense of Definition B.6,

$$\sum_{x \in X} f(x) = \sup\Big\{\sum_{x \in F} f(x),\ F \subset X \text{ finite}\Big\}, \tag{B.11}$$

which is true even if the supremum on the right-hand side is infinite (in which
case the left-hand side simply does not converge).


Therefore, for $f: X \to \mathbb{C}$, one may use (B.11) to check if $\sum_{x \in X} |f(x)| < \infty$, in which
case it makes sense to try and find the value $v$ of $\sum_{x \in X} f(x)$ as in Definition B.6.


Proof. 1. We write $f = f_1 + i f_2$, with $f_i: X \to \mathbb{R}$, and for given $G \subset X$, write
$G_{i\pm} = \{x \in G \mid \pm f_i(x) \geq 0\}$ (the ambiguity at those $x$ where $f_i(x) = 0$ is irrelevant). Then

$$\Big|\sum_{x \in G} f(x)\Big| \leq \sum_{x \in G} |f(x)| \leq \sum_{x \in G} |f_1(x)| + \sum_{x \in G} |f_2(x)|$$
$$= \sum_{x \in G_{1+}} f_1(x) - \sum_{x \in G_{1-}} f_1(x) + \sum_{x \in G_{2+}} f_2(x) - \sum_{x \in G_{2-}} f_2(x)$$
$$\leq 4 \sup\Big\{\Big|\sum_{x \in G_\alpha} f(x)\Big|,\ \alpha \in \{1+, 1-, 2+, 2-\}\Big\}. \tag{B.12}$$

Using Proposition B.8 below, the first inequality in (B.12) shows that absolute
convergence implies convergence in the sense of Cauchy, whereas the last inequality
(i.e., $\sum_{x \in G} |f(x)| \leq 4 \sup \cdots$) shows the converse.

2. We pick $\varepsilon > 0$ and abbreviate the right-hand side of (B.11) as $\sigma$. By definition
of the supremum (which we assume finite) there is a finite $F \subset X$ for which
$\sigma \geq \sum_{x \in F} f(x) > \sigma - \varepsilon$. Since the terms are positive, for any finite $G \supset F$ we
have $\sum_{x \in G} f(x) \geq \sum_{x \in F} f(x)$ and hence also $\sigma \geq \sum_{x \in G} f(x) > \sigma - \varepsilon$, from which
$|\sum_{x \in G} f(x) - \sigma| < \varepsilon$. Hence $\sum_{x \in X} f(x) = \sigma$ by Definition B.6.
The same argument works if $\sigma = \infty$, in which case for any $0 < M < \infty$ there is a
finite $F \subset X$ for which $\sum_{x \in F} f(x) > M$, and hence certainly $\sum_{x \in G} f(x) > M$.


Leaving its proof to the reader, we state the Cauchy condition for convergence:


Proposition B.8. We have $\sum_{x \in X} f(x) = v$ for some (necessarily unique) $v \in V$, in the
sense of Definition B.6, iff for each $\varepsilon > 0$ there is a finite subset $F \subset X$ such that for
each finite subset $G' \subset X \setminus F$ we have $\|\sum_{x \in G'} f(x)\| < \varepsilon$.

For an uncountable set $X$, Definition B.6 is not as bad as it may sound, since whenever
$\sum_{x \in X} |f(x)| < \infty$, only a countable number of terms can be nonzero (proof by
contradiction: if not, there must be an $n \in \mathbb{N}$ for which infinitely many $x$ satisfy
$|f(x)| > 1/n$ (nested proof by contradiction: if not, then for all $n$, only finitely many
$x$ satisfy $|f(x)| > 1/n$, and hence, a countable union of finite sets remaining countable,
only a countable number of $x$ can have $f(x) \neq 0$), so the sum of $|f(x)|$ over
those $x$ alone already diverges). In particular, for $X = \mathbb{N}$ the sum in (B.6) has its
usual meaning. However, even for $X = \mathbb{N}$, the sums just defined only have their
usual meaning if the series in question is absolutely convergent (the standard
counterexample of a real series $\sum_n x_n$ that is convergent but not absolutely convergent is
given by $x_n = (-1)^n/n$; in the above light, taking $G = F \cup E$, where $E$ is a large but
finite set of even numbers, then makes $|\sum_{n \in G} x_n - x|$ as big as you do not like).
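
The counterexample in parentheses can be made quantitative. In the sketch below the cut-offs are arbitrary choices of ours: $F = \{1,\ldots,1000\}$ gives a partial sum close to the ordinary limit $x = -\ln 2$ of the series, but enlarging $F$ by a finite set $E$ of even numbers moves the sum far away again, so that the condition of Definition B.6 fails.

```python
import math

def x(n):
    """Terms of the conditionally convergent series sum_n (-1)^n / n."""
    return (-1) ** n / n

limit = -math.log(2)              # value of the series in the usual (ordered) sense

N = 1000
F = range(1, N + 1)
E = range(N + 2, 200001, 2)       # finitely many even numbers beyond N

sum_F = math.fsum(x(n) for n in F)
sum_G = sum_F + math.fsum(x(n) for n in E)   # G = F ∪ E, a disjoint union

print(abs(sum_F - limit))   # small: the ordered partial sums do converge
print(abs(sum_G - limit))   # large: the finite superset G of F spoils the estimate
```

Since the sum of $1/n$ over the even numbers diverges, the second deviation can be made arbitrarily large by extending $E$, whatever $F$ one starts from.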

Using the triangle inequality for the norm and the Cauchy criterion for convergence,
it is easy to show that if $V$ is a Banach space and $\sum_{x\in X} \|f(x)\| < \infty$, then
the sum $\sum_{x\in X} f(x)$ exists in $V$ (i.e., it equals some $v \in V$ in the sense of Definition B.6).
The implication is one-sided, though: the latter sum may exist even if the
former does not. For example, take $V = \ell^2(\mathbb{N})$, pick some $f \in \ell^2(\mathbb{N})$, and define
$\hat{f} : \mathbb{N} \to \ell^2(\mathbb{N})$ by $\hat{f}(x) = f(x)\delta_x$, where $\delta_x(y) = \delta_{xy}$ (and hence $\|\delta_x\|_2 = 1$). Then

$$\sum_{x\in\mathbb{N}} \|\hat{f}(x)\|_2 = \sum_{x\in\mathbb{N}} |f(x)| = \|f\|_1.$$

Now $\sum_{x\in\mathbb{N}} \hat{f}(x) = f$ exists per assumption that $f \in \ell^2(\mathbb{N})$ and hence $\|f\|_2 < \infty$,
which is implied by, but is not equivalent to, $\|f\|_1 < \infty$. See also §B.12 below.

In any case, the meaning of the possibly uncountable sums in (B.6) and (B.8)
should be clear now, as only finite sums (B.11) are involved; for (B.10), by Hölder's
inequality (B.15) below for $p = q = 2$, the sum in question is absolutely convergent,
and hence it falls within the scope of Definition B.6 and Lemma B.7.


Theorem B.9. For any $1 \le p \le \infty$, the set $\ell^p(X)$ is a vector space under pointwise
operations. Moreover, $\ell^p(X)$ is a Banach space in the norm (B.8) - (B.9).


Proof. 1. $\ell^p$ is a vector space. The case $p = \infty$ is obvious. For $1 \le p < \infty$, use the
convexity of the function $t \mapsto t^p$ for $t \in [0,\infty)$. For convex functions one has
$f(\tfrac12(t_1+t_2)) \le \tfrac12(f(t_1) + f(t_2))$, so that $(\tfrac12(t_1+t_2))^p \le \tfrac12(t_1^p + t_2^p)$. Combined
with monotonicity of the function $t \mapsto t^p$ on $[0,\infty)$, i.e. $s \le t \Rightarrow s^p \le t^p$, this gives

$$|f(x) + g(x)|^p \le (|f(x)| + |g(x)|)^p \le 2^{p-1}(|f(x)|^p + |g(x)|^p), \tag{B.13}$$

so that summing over $x$ gives $\|f + g\|_p^p \le 2^{p-1}(\|f\|_p^p + \|g\|_p^p) < \infty$.
Hence if $f \in \ell^p$ and $g \in \ell^p$, then $f + g \in \ell^p$.

2. $\|\cdot\|_p$ is a norm on $\ell^p$. The case $p = \infty$ is, once again, obvious. For $1 \le p < \infty$,
the only nontrivial part is the triangle inequality

$$\|f + g\|_p \le \|f\|_p + \|g\|_p, \tag{B.14}$$

called the Minkowski inequality. This follows from Hölder's inequality:

$$\|fg\|_1 \le \|f\|_p \|g\|_q, \tag{B.15}$$

which is valid for $f \in \ell^p$ and $g \in \ell^q$, where $1 \le p \le \infty$ and $1 \le q \le \infty$ satisfy

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{B.16}$$

Thus one has $q = p/(p-1)$ for $1 < p < \infty$, or $q = \infty$ for $p = 1$, or $q = 1$ for
$p = \infty$. One calls $p$ and $q$ conjugate exponents (so that $p = 2$ is self-conjugate).
3. $\ell^p$ is complete in the norm $\|\cdot\|_p$. We must prove that every Cauchy sequence $(f_k)$
in $\ell^p$ converges. This takes three steps, which we first carry out for $1 \le p < \infty$.


a. Find a candidate $f$ for the limit. Since $(f_k)$ is Cauchy, for each $\varepsilon > 0$ there
exists $K \in \mathbb{N}$ such that $\|f_k - f_l\|_p < \varepsilon$ for all $k,l > K$, or

$$\|f_k - f_l\|_p^p = \sum_{x\in X} |f_k(x) - f_l(x)|^p < \varepsilon^p. \tag{B.17}$$

Hence $|f_k(x) - f_l(x)|^p < \varepsilon^p$ for all $x$, so $(f_k(x))_k$ is a Cauchy sequence in $\mathbb{C}$.
Since $\mathbb{C}$ is complete, $(f_k(x))_k$ converges, hence we may define $f : X \to \mathbb{C}$ by

$$f(x) = \lim_{k\to\infty} f_k(x). \tag{B.18}$$
b. Show that $f \in \ell^p$. Note that

$$\|g\|_p^p = \sup_F \sum_{x\in F} |g(x)|^p, \tag{B.19}$$

where the supremum is over all finite subsets $F \subset X$. For fixed $F$ we have

$$\sum_{x\in F} |f_k(x) - f_l(x)|^p < \varepsilon^p.$$

Since the sum is finite, we may take $\lim_{k\to\infty}$, giving $\sum_{x\in F} |f(x) - f_l(x)|^p \le \varepsilon^p$.
By (B.19), the sup over all finite $F$ yields: $\forall \varepsilon > 0\ \exists K \in \mathbb{N}$ such that $\forall l > K$,
we have $\|f - f_l\|_p^p \le \varepsilon^p$. For fixed $\varepsilon$ and $l$, this says that $f - f_l \in \ell^p$, so $f \in \ell^p$,
because $f = (f - f_l) + f_l$ with $f_l \in \ell^p$, and we know that $\ell^p$ is a vector space.
c. Show that $f_k \to f$ in $\ell^p$. This is contained in the previous step, since we had

$$\forall \varepsilon > 0\ \exists K \in \mathbb{N}\ \forall l > K : \|f - f_l\|_p \le \varepsilon. \tag{B.20}$$

But this is the same as $\lim_{l\to\infty} \|f - f_l\|_p = 0$, or $f_l \to f$ in $\ell^p$.


The proof for $p = \infty$ is virtually the same, with (B.19) replaced by

$$\|g\|_\infty = \sup_{F\subset X}\,\sup_{x\in F}\{|g(x)|\}. \tag{B.21}$$

Within the finite supremum $\sup_{x\in F} |f_k(x) - f_l(x)| < \varepsilon$, we may take the limit
$k \to \infty$ once again, followed by a supremum over $F \subset X$.
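
The inequalities used in this proof are easy to test on finite sequences. The Python sketch below is merely a numerical sanity check under our own assumptions (the sample vectors `f`, `g`, the exponent `p = 3`, and the helper names are illustrative and not part of the text); it evaluates both sides of Hölder's inequality (B.15), the Minkowski inequality (B.14), and the summed form of (B.13).

```python
def p_norm(f, p):
    """l^p norm of a finite sequence f, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in f) ** (1.0 / p)

def conjugate_exponent(p):
    """Exponent q with 1/p + 1/q = 1, valid for 1 < p < infinity."""
    return p / (p - 1.0)

f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -4.0, 2.0]
p = 3.0
q = conjugate_exponent(p)

# Hoelder (B.15): ||fg||_1 <= ||f||_p ||g||_q
holder_lhs = sum(abs(a * b) for a, b in zip(f, g))
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski (B.14): ||f+g||_p <= ||f||_p + ||g||_p
minkowski_lhs = p_norm([a + b for a, b in zip(f, g)], p)
minkowski_rhs = p_norm(f, p) + p_norm(g, p)

# Summed form of (B.13): ||f+g||_p^p <= 2^(p-1) (||f||_p^p + ||g||_p^p)
convexity_lhs = minkowski_lhs ** p
convexity_rhs = 2 ** (p - 1) * (p_norm(f, p) ** p + p_norm(g, p) ** p)

print(holder_lhs <= holder_rhs, minkowski_lhs <= minkowski_rhs,
      convexity_lhs <= convexity_rhs)
```

Note that the convexity bound is much cruder than Minkowski's inequality; it suffices for closure under addition, but not for the triangle inequality.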




B.3 Banach spaces of continuous functions 


Further Banach spaces that can be defined without measure theory come from topology,
notably from the class of locally compact spaces $X$ (like $\mathbb{N}$, or $\mathbb{R}^n$, etc.).
For any $f : X \to \mathbb{C}$, define the support of $f$ as the closure of the set where $f \neq 0$.


Definition B.10. Let $X$ be a locally compact space. Then:


• $C(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$;

• $C_c(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ with compact support;

• $C_0(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity,
i.e., for any $\varepsilon > 0$ the set $\{x \in X \mid |f(x)| \ge \varepsilon\}$ is compact, or, equivalently, for
any $\varepsilon > 0$ there is a compact set $K \subset X$ such that $|f(x)| < \varepsilon$ for all $x \notin K$;

• $C_b(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ that are bounded, i.e.,
there is a constant $C > 0$ (which depends on $f$) such that $|f(x)| \le C$ for all $x \in X$.


In general, one has the obvious inclusions

$$C_c(X) \subseteq C_0(X) \subseteq C_b(X) \subseteq C(X), \tag{B.22}$$

with strict inclusions iff $X$ is non-compact, and equalities iff $X$ is compact.

For example, if $X = \mathbb{R}$, then $f(x) = \exp(-x^2)$ lies in $C_0$, whereas $f(x) = 1$ is in $C_b$.
If $X$ is discrete, the spaces $\ell_c(X)$ and $\ell^\infty(X)$ of the previous section are the same as
$C_c(X)$ and $C_b(X)$, respectively, and we may also write $\ell_0(X) = C_0(X)$.

Theorem B.11. The sets $C_c(X)$, $C_0(X)$, $C_b(X)$, and $C(X)$ are vector spaces under
pointwise operations, and $C_0(X)$ and $C_b(X)$ are Banach spaces in the sup-norm

$$\|f\|_\infty = \sup_{x\in X}\{|f(x)|\}. \tag{B.23}$$

In particular, if $X$ is compact, then $C(X)$ is a Banach space in the norm (1.24).


Proof. Only completeness in the sup-norm (B.23) is nontrivial. We use the fact
from elementary analysis that sup-norm (i.e., uniform) limits $f$ of Cauchy sequences $(f_n)$ of
bounded continuous functions exist (they are given by the pointwise limit $f(x) = \lim_n f_n(x)$)
and are continuous. Therefore, concerning $C_0(X)$ we just need to show that the limit
$f$ of a convergent sequence $(f_n)$ in $C_0(X)$ vanishes at infinity. Indeed, for given $\varepsilon > 0$,
since $f_n \to f$ uniformly, we can find $N$ such that $|f(x) - f_n(x)| < \varepsilon/2$ for all $x$
and all $n > N$. Fixing one such $n$, since $f_n \in C_0(X)$ we can also find some compact $K \subset X$
such that $|f_n(x)| < \varepsilon/2$ for all $x \notin K$. Hence for $x \notin K$,

$$|f(x)| \le |f(x) - f_n(x)| + |f_n(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \tag{B.24}$$

To show that the limit $f$ of a sequence $(f_n)$ in $C_b$ is again bounded, note that for
$\varepsilon > 0$ we have $|f(x) - f_n(x)| < \varepsilon$ for $n > N$ and $|f_n(x)| \le C_n$, both for all $x$, whence

$$|f(x)| \le |f(x) - f_n(x)| + |f_n(x)| < \varepsilon + C_n < \infty, \tag{B.25}$$

so $f$ is bounded and hence lies in $C_b(X)$.


B.4 Basic measure theory


Measure theory studies measure spaces $(X,\Sigma,\mu)$, where $X$ is a set, and:


• $\Sigma \subset \mathcal{P}(X)$ is a so-called σ-algebra of subsets of $X$, which means that:

1. $X \in \Sigma$;
2. If $A \in \Sigma$, then $A^c \in \Sigma$ (where $A^c = X\setminus A$ is the complement of $A$);
3. If $A_n \in \Sigma$ for $n \in \mathbb{N}$, then $\cup_n A_n \in \Sigma$ (i.e., $\Sigma$ is closed under countable unions).

It follows that $\emptyset \in \Sigma$, and that $\Sigma$ is closed under countable intersections, too.
• $\mu : \Sigma \to [0,\infty]$, called a (positive) measure, is countably additive, i.e.,

$$\mu(\cup_n A_n) = \sum_n \mu(A_n), \tag{B.26}$$

whenever $A_n \in \Sigma$, $n \in \mathbb{N}$, $A_i \cap A_j = \emptyset$ for all $i \neq j$. The obvious convention here
is that $t + \infty = \infty$ for any $t \in \mathbb{R}^+$, as well as $\infty + \infty = \infty$. Countable additivity is
indispensable in almost every limit argument in measure theory.


A probability space is a measure space $(X,\Sigma,\mu)$ for which $\mu(X) = 1$. More generally,
a measure space is called finite if $\mu(X) < \infty$, which evidently implies $\mu(A) < \infty$
for any $A \in \Sigma$, and σ-finite if $X$ is a countable union $X = \cup_n A_n$ with $\mu(A_n) < \infty$ for
each $n$. For example, $X = \mathbb{R}$ is σ-finite, whilst $X = [0,1]$ with Lebesgue measure is
finite. The non-σ-finite case is pathological and hardly occurs in practice.


This definition of a σ-algebra marks a difference with a topology on $X$, which is
a collection $\mathcal{O}(X)$ of open subsets (containing $X$ and the empty set $\emptyset$) that is closed
under arbitrary unions and finite intersections (but not under complementation!).
Nonetheless, topology and measure theory are closely related:


1. Any topological space $X$ gives rise to a σ-algebra $\mathcal{B}(X)$, viz. the smallest
σ-algebra in $\mathcal{P}(X)$ that contains $\mathcal{O}(X)$ (this exists and equals the intersection of all
σ-algebras that contain $\mathcal{O}(X)$, where one notes that the intersection of any family
of σ-algebras is again a σ-algebra). Elements of $\mathcal{B}(X)$ are called Borel sets.

2. The definition of a continuous function $f : X \to Y$ between topological spaces $X$
and $Y$ as a function for which $f^{-1}(V) \in \mathcal{O}(X)$ for each $V \in \mathcal{O}(Y)$, is copied by
saying that $f : X \to Y$ is measurable with respect to given σ-algebras $\Sigma_X$ (on $X$)
and $\Sigma_Y$ (on $Y$) if $f^{-1}(B) \in \Sigma_X$ for any $B \in \Sigma_Y$.

3. If $X$ and $Y$ are topological spaces and $\Sigma_X = \mathcal{B}(X)$, $\Sigma_Y = \mathcal{B}(Y)$, then it is easy
to show that $f$ is (Borel) measurable iff $f^{-1}(B) \in \Sigma_X$ merely for any $B \in \mathcal{O}(Y)$,
from which it follows that each continuous function is measurable. For $f : X \to \mathbb{R}$
to be measurable it is even sufficient that $f^{-1}((t,\infty)) \in \Sigma_X$ for each $t \in \mathbb{R}$.

4. The above condition of σ-finiteness is often used just in case the $A_n$ are compact.
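
On a finite set the "smallest σ-algebra containing a given family" of item 1 can be computed by brute force, since countable unions reduce to finite ones. The following Python sketch is an illustration under that finiteness assumption; the function name `generate_sigma_algebra` and the example family are hypothetical choices of ours.

```python
from itertools import combinations

def generate_sigma_algebra(X, generators):
    """Smallest family containing `generators` that is closed under
    complements and (finite) unions; X must be a finite set."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:                      # iterate until nothing new appears
        changed = False
        current = list(sets)
        for a in current:               # close under complementation
            if X - a not in sets:
                sets.add(X - a)
                changed = True
        for a, b in combinations(current, 2):   # close under unions
            if a | b not in sets:
                sets.add(a | b)
                changed = True
    return sets

sigma = generate_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(a) for a in sigma))
```

In this example the generated σ-algebra has the atoms {1}, {2}, {3, 4}, so it cannot separate the points 3 and 4; closure under intersections comes for free, as stated after the axioms in the definition of a σ-algebra.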


An important goal of measure theory is to provide a rigorous theory of integration;
here the key idea (due to Lebesgue) is that in defining the integral of some
measurable function $f : X \to \mathbb{R}$, one should partition the range $\mathbb{R}$ rather than the
domain $X$, as had been done in the Calculus since Newton (where typically $X \subset \mathbb{R}^n$).
This, in turn, suggests that $f$ should first be approximated by simple functions.
These are measurable functions $s : X \to \mathbb{R}^+$ with finite range, or, equivalently,

$$s = \sum_{i=1}^{n} \lambda_i 1_{A_i}, \tag{B.27}$$

where $\lambda_i \ge 0$, $A_i \in \Sigma$, and $n < \infty$. Such a representation is unique if we require
that the sets $A_i$ are mutually disjoint and the coefficients $\lambda_i$ are distinct; namely, if
$\{x_1,\ldots,x_n\}$ are the distinct values of $s$, one takes $A_i = s^{-1}(x_i)$ and $\lambda_i = x_i$. Given
some measure $\mu$, we further restrict the class of simple functions to those for which
$\mu(A_i) < \infty$. One then first defines the integral of a simple function $s$, as in (B.27), by

$$\int_X d\mu\, s = \sum_{i=1}^{n} \lambda_i \mu(A_i); \tag{B.28}$$

a nontrivial argument shows that the right-hand side is independent of the particular
representation (B.27) of $s$ used on the left. Granting this, linearity of the integral
on simple functions is immediate. Subsequently, for positive measurable functions
$f \ge 0$, writing $s \le f$ iff $s(x) \le f(x)$ for each $x \in X$, one defines the integral by

$$\int_X d\mu\, f = \sup\left\{ \int_X d\mu\, s \;\Big|\; 0 \le s \le f,\ s \text{ simple} \right\}. \tag{B.29}$$
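
The independence of (B.28) from the chosen representation (B.27) can be checked by hand on a small example. The Python sketch below assumes a hypothetical four-point measure space with rational point weights (all names and numbers are ours, for illustration only) and compares the canonical representation of a simple function with an overlapping one.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0, 1, 2, 3} with point weights.
weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

def mu(A):
    """Measure of a subset A of X."""
    return sum(weights[x] for x in A)

def integral_simple(rep):
    """Right-hand side of (B.28) for rep = [(lambda_i, A_i), ...]."""
    return sum(lam * mu(A) for lam, A in rep)

def evaluate(rep, x):
    """Value s(x) of the simple function s = sum_i lambda_i 1_{A_i}."""
    return sum(lam for lam, A in rep if x in A)

# Two representations of the same function s (values 3, 3, 5, 0 on 0, 1, 2, 3):
rep_canonical = [(Fraction(3), {0, 1}), (Fraction(5), {2})]        # disjoint sets
rep_overlapping = [(Fraction(3), {0, 1, 2}), (Fraction(2), {2})]   # overlapping sets

print(integral_simple(rep_canonical), integral_simple(rep_overlapping))
```

Both representations give the same integral, as the general argument guarantees; exact rational arithmetic is used so that the comparison is not blurred by rounding.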


For measurable functions $f : X \to \mathbb{C}$, one first decomposes $f$ as

$$f = \sum_{k=0}^{3} i^k f_k, \tag{B.30}$$

where, writing $f = \mathrm{Re}(f) + i\,\mathrm{Im}(f) \equiv f' + if''$, $f_0 = f'_+$, $f_1 = f''_+$, $f_2 = f'_-$, and
$f_3 = f''_-$, so that $f^\bullet = f^\bullet_+ - f^\bullet_-$ for $\bullet = {}', {}''$, one may take $f^\bullet_\pm = \tfrac12(|f^\bullet| \pm f^\bullet)$.
On this basis, one then defines the integral by linear extension of (B.29), that is,

$$\int_X d\mu\, f = \sum_{k=0}^{3} i^k \int_X d\mu\, f_k. \tag{B.31}$$

We call $f$ integrable with respect to $\mu$, writing $f \in \mathcal{L}^1(X,\Sigma,\mu)$, if

$$\int_X d\mu\, |f| < \infty; \tag{B.32}$$

this implies that each positive part $f_k$, and hence also $f$ itself, is integrable, i.e.,

$$\left| \int_X d\mu\, f \right| < \infty. \tag{B.33}$$

However, (B.33) does not imply (B.32). From (B.32) one has the useful estimates

$$\left| \int_X d\mu\, f \right| \le \int_X d\mu\, |f| \le \|f\|_\infty^{\mathrm{ess}}\, \mu(X), \tag{B.34}$$

where the essential supremum of $f$ (with respect to $\mu$) is defined by

$$\|f\|_\infty^{\mathrm{ess}} = \inf\{t \in [0,\infty] \mid |f| \le t\ \mu\text{-almost everywhere}\}, \tag{B.35}$$

where $|f| \le t$ $\mu$-a.e. means that $\mu(\{x \in X \mid |f(x)| > t\}) = 0$. In (B.34), the expressions
$\|f\|_\infty^{\mathrm{ess}}$ and/or $\mu(X)$ may well be infinite (in which case the second estimate
still holds, of course!). However, if $X$ is a locally compact space (see the next section),
$\mu$ is finite, and $f \in C_0(X)$ or even $f \in C_b(X)$, then all of (B.34) is finite.

Linearity of the integral is far from trivial: the proof relies on linearity for simple
functions, as well as on a fundamental approximation lemma:


Lemma B.12. If $f \ge 0$ is measurable, there is a monotone increasing sequence of
simple functions $s_n$, i.e., such that $0 \le s_1 \le s_2 \le \cdots \le s_n \le s_{n+1} \le \cdots \le f$ pointwise,
for which $s_n \to f$ pointwise (i.e., $\lim_{n\to\infty} s_n(x) = f(x)$ for each $x \in X$).
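
One common way to obtain such a sequence is the dyadic construction $s_n = \min(2^{-n}\lfloor 2^n f\rfloor, n)$, which partitions the range of $f$ in the spirit of Lebesgue. The text does not specify which construction it has in mind, so the Python sketch below should be read as an assumption on our part; the test function and sample points are likewise illustrative.

```python
import math

def dyadic_approximation(f, n):
    """Return s_n = min(floor(2^n f) / 2^n, n); s_n takes finitely many values."""
    def s_n(x):
        return min(math.floor(f(x) * 2 ** n) / 2 ** n, float(n))
    return s_n

f = lambda x: x * x                       # sample positive function
points = [0.0, 0.3, 1.7, 2.5]             # sample points of X = R
levels = [dyadic_approximation(f, n) for n in range(1, 21)]

# table[i][n-1] = s_n(points[i]); each row should increase towards f(points[i])
table = [[s(x) for s in levels] for x in points]
for x, row in zip(points, table):
    print(x, row[0], row[4], row[-1], f(x))
```

Each row is increasing in $n$ and stays below $f(x)$; once $n$ exceeds $f(x)$ the cap is inactive and the error is at most $2^{-n}$.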


Furthermore, one needs one of the two great convergence theorems of measure theory
named after Lebesgue, both of which (for future use) we now state. In these
theorems (as well as in many others), we say that a measurable function $f : X \to \mathbb{C}$
has some property $\mu$-almost everywhere ($\mu$-a.e.) if the set where $f$ does not have the
said property has measure zero. For example $f = 0$ $\mu$-a.e. means that $f(x) = 0$ for
each $x \notin N$, for some measurable set $N$ with $\mu(N) = 0$ (as they say, “morally”, the
behaviour of measurable functions on subsets of measure zero should not matter).


Theorem B.13. Let $(f_n)$ be a sequence of (complex-valued) measurable functions.


1. Dominated Convergence: if $(f_n)$ converges pointwise $\mu$-a.e. to some function $f$
and $|f_n(x)| \le g(x)$ $\mu$-a.e. for some $g \in \mathcal{L}^1(X,\Sigma,\mu)$, then $f \in \mathcal{L}^1(X,\Sigma,\mu)$, and

$$\lim_{n\to\infty} \int_X d\mu\, f_n = \int_X d\mu\, f. \tag{B.36}$$

2. Monotone Convergence: if $f_n \ge 0$ and $(f_n)$ is monotone increasing $\mu$-a.e., and

$$\sup_n \left\{ \int_X d\mu\, f_n \right\} < \infty, \tag{B.37}$$

then $\lim_{n\to\infty} f_n(x) = f(x)$ exists $\mu$-a.e., $f \in \mathcal{L}^1(X,\Sigma,\mu)$, and (B.36) holds.

Note that the first conclusion of the monotone convergence theorem is an assumption
in the dominated one! Either way, the fact that the pointwise limit function $f$ is
integrable, being implicit in the notation $f \in \mathcal{L}^1(X,\Sigma,\mu)$, is part of the result.


Corollary B.14. Integration is linear, i.e., if $f_1, f_2$ are integrable and $\lambda_1, \lambda_2 \in \mathbb{C}$,

$$\int_X d\mu\, (\lambda_1 f_1 + \lambda_2 f_2) = \lambda_1 \int_X d\mu\, f_1 + \lambda_2 \int_X d\mu\, f_2. \tag{B.38}$$


Proof. If $f_1 \ge 0$, $f_2 \ge 0$, let $s_n^{(1)} \to f_1$ and $s_n^{(2)} \to f_2$, as in Lemma B.12. Then the
conditions of the monotone convergence theorem hold, because integration is itself a
monotone operation (i.e., if $f \le g$, then $\int_X d\mu\, f \le \int_X d\mu\, g$). Combined with linearity
on simple functions (as already established above), this yields the claim.


B.5 Measure theory on locally compact Hausdorff spaces


For us it suffices to deal with locally compact Hausdorff spaces $X$. Our main goal
is Corollary B.21. We say that a map $\varphi : C(X) \to \mathbb{C}$ is positive if $\varphi(f) \ge 0$ whenever
$f \ge 0$ (pointwise). We also write $\mathcal{O}(X)$ for the set of open subsets of $X$, whilst $\mathcal{K}(X)$
denotes the set of all compact subsets of $X$. We first assume that $X$ is compact. Any
finite measure $\mu : \mathcal{B}(X) \to [0,\infty)$ gives rise to a positive linear map $\varphi : C(X) \to \mathbb{C}$,

$$\varphi(f) = \int_X d\mu\, f, \quad f \in C(X). \tag{B.39}$$


Conversely, any such map canonically defines a finite measure $\mu$ at least on opens
$U \in \mathcal{O}(X)$ and on compacta $K \in \mathcal{K}(X)$ (which are key examples of Borel sets) by

$$\mu(U) = \sup\{\varphi(f) \mid f \in C_c(U),\ 0 \le f \le 1_X\}; \tag{B.40}$$
$$\mu(K) = \inf\{\varphi(f) \mid f \in C_c(X),\ 0 \le f \le 1_X,\ f_{|K} = 1_K\}. \tag{B.41}$$

Subsequently, this preliminary measure is (hopefully!) to be extended to at least all
of $\mathcal{B}(X)$, i.e., to all Borel sets, in such a way that $\mu$ recovers $\varphi$ via (B.39).

This works, and one even obtains a bijective correspondence between finite measure
spaces $(X,\Sigma,\mu)$ and positive linear maps $\varphi : C(X) \to \mathbb{C}$ if the former are subjected
to two additional conditions, predicated on having $\mathcal{B}(X) \subset \Sigma$, namely:


• completeness, in that $\mu(B) = 0$ and $A \subset B$ for $A \in \mathcal{P}(X)$, $B \in \Sigma$ imply $A \in \Sigma$;
• regularity, i.e., for a given measure $\mu : \Sigma \to [0,\infty]$, for any $A \in \Sigma$, one has

$$\mu(A) = \mu_*(A) = \mu^*(A), \tag{B.42}$$

where the outer measure $\mu^*$ and inner measure $\mu_*$ are defined by

$$\mu^*(A) = \inf\{\mu(U) \mid U \supset A,\ U \in \mathcal{O}(X)\}; \tag{B.43}$$
$$\mu_*(A) = \sup\{\mu(K) \mid K \subset A,\ K \in \mathcal{K}(X)\}, \tag{B.44}$$

respectively. These expressions apparently make sense for all subsets $A \subset X$, but
lovers of the Banach–Tarski Paradox may be reassured that $\mu^*$ and $\mu_*$ typically
fail to be countably additive if they are seen as maps from $\mathcal{P}(X)$ to $[0,\infty]$.

For future reference we also define $(X,\Sigma,\mu)$ to be inner regular if (merely)
$\mu_*(A) = \mu(A)$ for $A \in \Sigma$, and outer regular if (merely) $\mu^*(A) = \mu(A)$, $A \in \Sigma$.
So a regular measure is both inner and outer regular. We are now in a position
to state the Riesz Representation Theorem (often attributed also to Radon).


Theorem B.15. Let $X$ be a compact Hausdorff space. There is a bijective correspondence
between complete regular finite measure spaces $(X,\Sigma,\mu)$ and positive
linear maps $\varphi : C(X) \to \mathbb{C}$, explicitly given as follows:


• The measure space $(X,\Sigma,\mu)$ defines $\varphi$ through (B.39), assuming (B.29) - (B.31);
• The map $\varphi$ defines the pair $(\Sigma,\mu)$ in three steps:

1. $\mu$ is given on opens $U$ and on compacta $K$ by (B.40) and (B.41), respectively;

2. $\Sigma$ is defined as the collection of all sets $A \in \mathcal{P}(X)$ for which $\mu^*(A) = \mu_*(A)$;

3. $\mu$ is given on all of $\Sigma$ by $\mu(A) = \mu^*(A)$, using (B.43), or, equivalently (given
the previous point), by $\mu(A) = \mu_*(A)$, based on (B.44).


We omit the lengthy proof, except by announcing that Theorem B.15 may be seen as
a special case of the more advanced Choquet theory reviewed in §B.11. For now, just
note that expressions like (B.40) and (B.41) are really desperate attempts to define
“$\mu(A) = \varphi(1_A)$”, which is OK for finite $X$, but in general is ill defined because even
for Borel sets $A$, the characteristic function $1_A$ is rarely continuous on $X$.

We note that $\mu$ has to be finite, since obviously $\mu(X) = \varphi(1_X)$. One can say a
little more about this. A linear map $\varphi : C(X) \to \mathbb{C}$ is bounded if, for some $0 \le C < \infty$,

$$|\varphi(f)| \le C\|f\|_\infty \quad (f \in C(X)). \tag{B.45}$$

In that case, the following expression, called the norm of $\varphi$, is $\le C$, hence finite:

$$\|\varphi\| = \sup\{|\varphi(f)|,\ f \in C(X),\ \|f\|_\infty = 1\}. \tag{B.46}$$


Proposition B.16. Let $X$ be a compact Hausdorff space. If a linear map $\varphi : C(X) \to \mathbb{C}$
is positive, then it is bounded, with norm

$$\|\varphi\| = \varphi(1_X). \tag{B.47}$$


Proof. Positivity makes $\langle f,g\rangle = \varphi(f^*g)$ a pre-inner product on $C(X)$, so by (A.1)
with $v = 1_X$ and $w = f$, we find $|\varphi(f)|^2 \le \varphi(|f|^2)\varphi(1_X)$ for any $f$. If $\|f\|_\infty = 1$, then
pointwise $0 \le |f|^2 \le 1_X$, so by positivity, $\varphi(|f|^2) \le \varphi(1_X)$. Hence $|\varphi(f)| \le \varphi(1_X)$,
so that $\|\varphi\| \le \varphi(1_X)$. Finally, taking $f = 1_X$ in (B.46) gives equality.
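
For a finite (hence compact and discrete) space $X$, Proposition B.16 reduces to an elementary statement about weighted sums, which the following Python sketch probes by random sampling. Everything here is an illustrative assumption of ours: the weights, the sample size, and the helper names `phi` and `sup_norm`.

```python
import random

weights = [0.5, 1.25, 0.0, 2.0]     # hypothetical non-negative weights on X = {0,1,2,3}

def phi(f):
    """Positive linear functional phi(f) = sum_x w_x f(x) on C(X)."""
    return sum(w * fx for w, fx in zip(weights, f))

def sup_norm(f):
    return max(abs(fx) for fx in f)

one = [1.0] * len(weights)          # the unit function 1_X

random.seed(0)
samples = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in weights]
           for _ in range(1000)]

# |phi(f)| / ||f||_inf never exceeds phi(1_X), and f = 1_X attains the bound.
ratios = [abs(phi(f)) / sup_norm(f) for f in samples]
print(max(ratios), phi(one))
```

Random sampling of course proves nothing; it merely illustrates that the supremum in (B.46) is attained at $f = 1_X$ and nowhere exceeded.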


A state on $C(X)$ is a positive linear functional $\varphi : C(X) \to \mathbb{C}$ with $\varphi(1_X) = 1$.


Corollary B.17. If X is a compact Hausdorff space, there is a bijective correspon- 
dence between states on C(X) and complete regular probability measures on X. 


We now move to the next case in difficulty, where $X$ is assumed to be σ-compact,
in being a countable union of compact sets, i.e., $X = \cup_n K_n$, where $K_n \in \mathcal{K}(X)$. Using
a little topology, this is actually equivalent to $X$ being a perhaps more appealing
union $X = \cup_n U_n$, where each $U_n$ is open with compact closure $\overline{U}_n$, and $\overline{U}_n \subset U_{n+1}$.
This, in turn, implies that $X = \cup_n K_n$, with $K_n \subset K_{n+1}$ all compact. If $(X,\Sigma,\mu)$ is a
measure space where $X$ is σ-compact topologically, $\mathcal{B}(X) \subset \Sigma$, and

$$\mu(K) < \infty \quad (K \in \mathcal{K}(X)), \tag{B.48}$$

then $X$ is also σ-finite measure-theoretically. Since these are the only σ-finite measure
spaces we will consider, with a slight change in terminology we call a locally
compact measure space $(X,\Sigma,\mu)$ σ-finite if it is also σ-compact and (B.48) holds.
The new point compared to the compact case is that functionals like the above
$\varphi$ should now be defined on the space $C_c(X)$ of continuous functions on $X$ with
compact support. Otherwise, Theorem B.15 may be repeated almost verbatim:


Theorem B.18. Let $X$ be a σ-compact Hausdorff space. There is a bijective correspondence
between complete regular σ-finite measure spaces $(X,\Sigma,\mu)$ and positive
linear maps $\varphi : C_c(X) \to \mathbb{C}$, explicitly given as in Theorem B.15.


For the sake of completeness we also state Theorem B.18 in the case where $X$ is not
even assumed to be σ-compact. In that case, inner regularity may be lost:


Theorem B.19. Let $X$ be a locally compact Hausdorff space. There is a bijective
correspondence between complete outer regular measure spaces $(X,\Sigma,\mu)$ satisfying
(B.48), and positive linear maps $\varphi : C_c(X) \to \mathbb{C}$, explicitly given as in Theorem B.15,
except for the fact that $\Sigma$ now consists of all $A \in \mathcal{P}(X)$ for which $\mu^*(A \cap K) < \infty$ and
$\mu^*(A \cap K) = \mu_*(A \cap K)$ for any $K \in \mathcal{K}(X)$. In that case, $\mu$ is defined by

$$\mu(A) = \mu^*(A), \quad A \in \Sigma. \tag{B.49}$$


However, this generality will not really be needed for our purposes, which will
only require finite measures, in which case outer regularity implies regularity.

In order to generalize Corollary B.17 to the σ-compact case, or even to the locally
compact case, we must involve the Banach spaces $C_b(X)$ and $C_0(X)$ of the
previous section. Also for linear maps $\varphi : C_c(X) \to \mathbb{C}$ or $\varphi : C_0(X) \to \mathbb{C}$ we use the
notation (B.46), where now the supremum is taken over $f \in C_c(X)$ and $f \in C_0(X)$,
respectively. For example, in the latter case, provided (B.45) holds, we have

$$\|\varphi\| = \sup\{|\varphi(f)|,\ f \in C_0(X),\ \|f\|_\infty = 1\}. \tag{B.50}$$


Lemma B.20. Let $X$ be a locally compact Hausdorff space.


1. $C_c(X)$ is a dense subspace of $C_0(X)$ with respect to the norm (B.23).
2. For a positive linear map $\varphi : C_c(X) \to \mathbb{C}$, the following are equivalent:

a. $\varphi$ is bounded, as in (B.45);
b. $\varphi$ can be extended to a positive linear map $\varphi : C_0(X) \to \mathbb{C}$.


In particular, a positive linear map $\varphi : C_0(X) \to \mathbb{C}$ is automatically bounded.


Proof. 1. The first claim means either of the following two equivalent properties:


• For any $f \in C_0(X)$ there is a sequence $(f_n)$ in $C_c(X)$ converging to $f$;
• For any $f \in C_0(X)$ and $\varepsilon > 0$ there is $g \in C_c(X)$ with $\|f - g\|_\infty < \varepsilon$.


We prove both. For some given $f \in C_0(X)$ and $\varepsilon > 0$, find the usual compact $K$ such
that $|f(x)| < \varepsilon$ outside $K$. Urysohn's Lemma gives $h \in C_c(X)$ with $0 \le h(x) \le 1$
for all $x \in X$ and $h(x) = 1$ for all $x \in K$. Take $g = fh \in C_c(X)$, so that $\|f - g\|_\infty < \varepsilon$.
For $\varepsilon = 1/n$, rename the $g$ thus constructed as $f_n$. Then $\|f - f_n\|_\infty \to 0$.

2. To go from 2.a to 2.b, using the previous item, let $f_n \to f$ uniformly (i.e., in the
sup-norm), and define the extension $\varphi : C_0(X) \to \mathbb{C}$ by $\varphi(f) = \lim_n \varphi(f_n)$. This
limit exists, since $|\varphi(f_n) - \varphi(f_m)| \le C\|f_n - f_m\|_\infty$, so that, $(f_n)$ being convergent
and hence Cauchy in $C_0(X)$, the sequence $(\varphi(f_n))$ is Cauchy in $\mathbb{C}$. The value
$\varphi(f)$ is easily verified to be independent of the approximating sequence $(f_n)$.
Finally, the approximation constructed in part 1 preserves positivity, i.e., if $f \ge 0$ then $f_n \ge 0$,
so also $\varphi(f) \ge 0$, as it has been defined as the limit of a positive sequence.


By definition, the converse implication 2.b → 2.a is equivalent to the claim that

$$\sup\{|\varphi(f)|,\ f \in C_0(X),\ \|f\|_\infty \le 1\} < \infty, \tag{B.51}$$

which in turn is equivalent to the apparently weaker claim to the effect that

$$\sup\{|\varphi(f_n)|,\ n \in \mathbb{N}\} < \infty, \tag{B.52}$$

for any sequence $(f_n)$ with $\|f_n\| \le 1$. Indeed, if the first supremum were infinite,
then for each $n \in \mathbb{N}$ there is $f_n$ such that $|\varphi(f_n)| > n$, and (B.52) could not possibly
hold. Furthermore, (B.52) need only hold for non-negative functions $f_n \ge 0$ (still
with $\|f_n\| \le 1$, of course) cf. (B.31), since $|\varphi(f_k)| = \varphi(f_k) \le C$ for each $k =
0,\ldots,3$ implies $|\varphi(f)| \le 4C$. And this, finally, reduces to the claim that

$$\sum_{n=1}^{\infty} g(n)\varphi(f_n) < \infty, \quad \forall g \in \ell^1(\mathbb{N}),\ g(n) \ge 0. \tag{B.53}$$

Namely, if the sequence $(\varphi(f_n))$ were unbounded, it would be trivial to find
such a summable function $g$ for which the sum in (B.53) diverges (for example,
take a subsequence for which $\varphi(f_{n_m}) > m$ and take $g$ such that $g(n_m) = 1/m^2$).

To prove (B.53), then, given that $f_n \ge 0$ and hence $\varphi(f_n) \ge 0$, with $\|f_n\| \le 1$, first
note that $\sum_n g(n) f_n$ converges in $C_0(X)$ (since it is obviously absolutely convergent,
and any absolutely convergent series in a Banach space converges). Calling
the sum $h$, for any $N < \infty$ we have $\sum_{n=1}^{N} g(n) f_n \le h$ and hence, by positivity of
$\varphi$, also $\sum_{n=1}^{N} g(n)\varphi(f_n) \le \varphi(h) < \infty$. Letting $N \to \infty$ gives (B.53).
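
The density argument in part 1 of this proof can be visualised for $X = \mathbb{R}$ and $f(x) = \exp(-x^2)$. In the Python sketch below, the piecewise-linear cut-off plays the role of the function $h$ supplied by Urysohn's Lemma; the specific cut-off, the margin 0.1, and the finite grid used to estimate the sup-norm are all assumptions made for the sake of illustration.

```python
import math

def f(x):
    """A function in C_0(R): continuous and vanishing at infinity."""
    return math.exp(-x * x)

def cutoff(x, R):
    """Continuous h with 0 <= h <= 1, h = 1 on [-R, R], h = 0 outside [-R-1, R+1]."""
    return max(0.0, min(1.0, R + 1.0 - abs(x)))

def approximation_error(eps, grid):
    """Grid estimate of ||f - f*h||_inf, where K = [-R, R] and |f| < eps outside K."""
    R = math.sqrt(math.log(1.0 / eps)) + 0.1
    return max(abs(f(x) - f(x) * cutoff(x, R)) for x in grid)

grid = [k / 100.0 for k in range(-1000, 1001)]      # sample points in [-10, 10]
errors = {n: approximation_error(1.0 / n, grid) for n in (2, 10, 100, 1000)}
print(errors)
```

The compactly supported functions $g = fh$ obtained for $\varepsilon = 1/n$ are the $f_n$ of the proof; their distance to $f$, estimated on the grid, stays below $1/n$.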


We now define a state on $C_0(X)$ as a positive (and hence bounded) linear functional
$\varphi : C_0(X) \to \mathbb{C}$ with $\|\varphi\| = 1$; this is consistent with the terminology for the compact
case because of (B.47), as well as with the terminology for C*-algebras.


Corollary B.21. Let $X$ be a locally compact Hausdorff space. There is a bijective
correspondence between positive linear functionals on $C_0(X)$ and complete regular
finite measures on $X$, explicitly given as in the bullet points of Theorem B.15.

In particular, states on $C_0(X)$ correspond to regular probability measures on $X$.


Proof. All that remains to be shown is that, under (B.39), we have

$$\|\varphi\| = \mu(X), \tag{B.54}$$

so that, in particular, the case $\|\varphi\| = 1$ corresponds to $\mu(X) = 1$. For compact $X$,
eq. (B.54) is immediate from (B.47). For locally compact $X$, we immediately see
from (B.39) and (B.50) that $\|\varphi\| \le \mu(X)$. To saturate this inequality, we use inner
regularity of the measure $\mu$ corresponding to $\varphi$, cf. Theorem B.19 and subsequent
comment. From (B.42) and (B.44), for any $\varepsilon > 0$ we can find $K \in \mathcal{K}(X)$ with
$\mu(X) - \mu(K) < \varepsilon$. Now use Urysohn's Lemma to find $f \in C_c(X)$ such that $0 \le f \le 1$
and $f_{|K} = 1$. Then $\varphi(f) \ge \mu(K)$, and, letting $\varepsilon \to 0$, eq. (B.54) follows.

Finally, we extend the above corollaries to the entire (Banach) dual $C_0(X)^*$, i.e.,
the space of all (i.e. not necessarily positive) bounded linear maps $\varphi : C_0(X) \to \mathbb{C}$,
equipped with the norm (B.50). As we shall see more generally in §B.9, this is a
vector space (under pointwise operations) and even a Banach space in its own right.

From the point of view of measure theory, the relevant concept is that of a complex
measure. This is a map $\mu : \Sigma \to \mathbb{C}$ satisfying the countable additivity condition
(B.26), as in the positive case. In the complex case this condition implies that $\mu$ is
finite. One then (trivially) has a decomposition $\mu = \mu' + i\mu''$, where $\mu'$ and $\mu''$ are
countably additive maps $\Sigma \to \mathbb{R}$ (just take $\mu' = \tfrac12(\mu + \mu^*)$ and $\mu'' = -\tfrac12 i(\mu - \mu^*)$,
where $\mu^*(A) = \overline{\mu(A)}$), and (nontrivially) has the (Hahn)–Jordan decomposition:


Theorem B.22. Let $\Sigma$ be a σ-algebra on a set $X$ and let $\mu$ be a (finite) signed measure,
i.e., a countably additive map $\Sigma \to \mathbb{R}$. Then there is a unique decomposition

$$\mu = \mu_+ - \mu_-, \tag{B.55}$$

where the measures $\mu_\pm : \Sigma \to \mathbb{R}^+$ are given by:

$$\mu_+(A) = \sup\{\mu(B) \mid B \subset A,\ B \in \Sigma\}; \tag{B.56}$$
$$\mu_-(A) = -\inf\{\mu(B) \mid B \subset A,\ B \in \Sigma\}, \tag{B.57}$$

and $\mu_+$ and $\mu_-$ are mutually singular in that there is a set $N \in \Sigma$ such that

$$\mu_+(N) = \mu_-(X\setminus N) = 0. \tag{B.58}$$

We will not prove this, just noting that in terms of the total variation $|\mu|$ of $\mu$, i.e.,

$$|\mu|(A) = \sup\left\{ \sum_{n\in\mathbb{N}} |\mu(A_n)| \right\}, \tag{B.59}$$

where the supremum is taken over all measurable partitions $A = \cup_n A_n$, one has

$$\mu_\pm = \tfrac12(|\mu| \pm \mu). \tag{B.60}$$
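
On a finite set with $\Sigma = \mathcal{P}(X)$, the formulas (B.55) - (B.60) can be verified by enumerating all subsets. The Python sketch below does so for a hypothetical signed measure given by integer point masses (the masses and function names are ours); it uses the fact that, on a finite set, the supremum in (B.59) is attained by the partition into singletons.

```python
from itertools import chain, combinations

# Hypothetical signed measure on X = {a, b, c, d}, defined by point masses.
mass = {"a": 3, "b": -2, "c": 0, "d": -5}
X = frozenset(mass)

def subsets(A):
    """All subsets of the finite set A."""
    items = sorted(A)
    return (frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

def mu(A):
    return sum(mass[x] for x in A)

def mu_plus(A):                 # (B.56): supremum of mu over subsets of A
    return max(mu(B) for B in subsets(A))

def mu_minus(A):                # (B.57): minus the infimum of mu over subsets of A
    return -min(mu(B) for B in subsets(A))

def total_variation(A):         # (B.59), attained by the partition into singletons
    return sum(abs(mass[x]) for x in A)

N = frozenset(x for x in X if mass[x] <= 0)   # candidate set for (B.58)
print(mu_plus(X), mu_minus(X), total_variation(X), sorted(N))
```

Here $\mu_+$ lives on the points of positive mass and $\mu_-$ on the remaining ones, which is exactly the mutual singularity (B.58).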


From the point of view of C*-algebras it is more natural to start from bounded
linear functionals on $C_0(X)$. First, we call a map $\varphi : C_0(X) \to \mathbb{C}$ hermitian if

$$\varphi(f^*) = \overline{\varphi(f)} \quad (f^*(x) = \overline{f(x)}). \tag{B.61}$$


Theorem B.23. 1. Any functional $\varphi \in C_0(X)^*$ has a unique decomposition

$$\varphi = \varphi' + i\varphi'', \tag{B.62}$$

where the functionals $\varphi' \in C_0(X)^*$ and $\varphi'' \in C_0(X)^*$ are hermitian.
2. Any hermitian functional $\varphi \in C_0(X)^*$ has a decomposition

$$\varphi = \varphi_+ - \varphi_-, \tag{B.63}$$

where the functionals $\varphi_\pm \in C_0(X)^*$ are positive, and are given on $f \ge 0$ by

$$\varphi_+(f) = \sup\{\varphi(g),\ g \in C_0(X),\ 0 \le g \le f\}; \tag{B.64}$$
$$\varphi_-(f) = -\inf\{\varphi(h),\ h \in C_0(X),\ 0 \le h \le f\}. \tag{B.65}$$

3. These expressions satisfy

$$\|\varphi\| = \|\varphi_+\| + \|\varphi_-\|, \tag{B.66}$$

and any positive functionals $\varphi_\pm \in C_0(X)^*$ that satisfy (B.63) as well as (B.66)
are necessarily given by (B.64) - (B.65).
4. Any functional $\varphi \in C_0(X)^*$ is a linear combination of at most four states.


Proof. 1. Take $\varphi' = \tfrac12(\varphi + \varphi^*)$ and $\varphi'' = -\tfrac12 i(\varphi - \varphi^*)$, where $\varphi^*(f) = \overline{\varphi(f^*)}$.

2. The range $h : 0 \le h \le f$ is the same as the range $h : 0 \le f - h \le f$, so that
(B.64) - (B.65) gives (B.63). Positivity of $\varphi_+$ follows because the value $\varphi(0) = 0$
is included in the supremum in (B.64), which therefore can only be $\ge 0$, and
likewise $-\varphi_-$ is negative (and hence $\varphi_-$ is positive) because $\varphi(0) = 0$ is included
in the infimum in (B.65), which therefore can only be $\le 0$.

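3. We first prove (B.66) for compact $X$, so that $1_X \in C_0(X) = C(X)$. From (B.47),

$$\|\varphi\| \le \|\varphi_+\| + \|\varphi_-\| = \varphi_+(1_X) + \varphi_-(1_X) \tag{B.67}$$
$$= \sup\{\varphi(g),\ 0 \le g \le 1_X\} - \inf\{\varphi(h),\ 0 \le h \le 1_X\}. \tag{B.68}$$

For any $\varepsilon > 0$, there is $g$ such that $\varphi(g)$ is close to the supremum in (B.68) by
$\tfrac12\varepsilon$, and likewise there is $h$ such that $\varphi(h)$ is close to the infimum in (B.68) by the
same amount, so that

$$|\varphi_+(1_X) + \varphi_-(1_X) - \varphi(g - h)| < \varepsilon. \tag{B.69}$$

Since $0 \le g \le 1_X$ and $0 \le h \le 1_X$, we have $\|g - h\| \le 1$, and therefore

$$\varphi(g - h) \le \|\varphi\|\,\|g - h\| \le \|\varphi\|. \tag{B.70}$$

Hence (B.67) gives

$$\|\varphi\| \le \|\varphi_+\| + \|\varphi_-\| \le \|\varphi\| + \varepsilon, \tag{B.71}$$

so letting $\varepsilon \to 0$ yields (B.66).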


For locally compact $X$, we reduce the proof to the compact case by forming the
one-point compactification $\dot{X}$ of $X$, cf. §C.6. As a set, this is $\dot{X} = X \cup \{\infty\}$, where
$\{\infty\}$ is a singleton. As a space, the open sets in $\dot{X}$ are the open sets in $X$ plus those
subsets of $\dot{X}$ whose complement is compact in $X$. The obvious injection $\iota : X \hookrightarrow \dot{X}$
is continuous, and any $f \in C_0(X)$ extends uniquely to a function $\dot{f} \in C(\dot{X})$ that
vanishes at the compactification point, i.e., $\dot{f}(\infty) = 0$. This yields an isometric
embedding $C_0(X) \hookrightarrow C(\dot{X})$. Furthermore, as vector spaces one has

$$C(\dot{X}) = C_0(X) \oplus \mathbb{C}\cdot 1_{\dot{X}}. \tag{B.72}$$

Any linear map $\varphi$ on $C_0(X)$ may then be extended to a linear map $\dot{\varphi}$ on $C(\dot{X})$ via

$$\dot{\varphi}(f + \lambda 1_{\dot{X}}) = \varphi(f) + \lambda\|\varphi\|, \quad f \in C_0(X),\ \lambda \in \mathbb{C}. \tag{B.73}$$

From the point of view of (B.39), this extension may alternatively be described
as follows: extend the measure $\mu$ on $X$ that underlies $\varphi$ to a measure $\dot{\mu}$ on $\dot{X}$ by
$\dot{\mu}(A \cup \{\infty\}) = \mu(A)$, $A \in \Sigma$. This shows that $\dot{\varphi}$ remains positive when $\varphi$ is, and
using (B.54) and the analogue of (B.47) for $\dot{X}$ instead of $X$, we also obtain

$$\|\dot{\varphi}\| = \dot{\varphi}(1_{\dot{X}}) = \dot{\mu}(\dot{X}) = \mu(X) = \|\varphi\|. \tag{B.74}$$

One may then repeat the proof of the compact case, using $\dot{\varphi}$ instead of $\varphi$.
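We just prove uniqueness for the compact case (in general, add dots as in the
previous proof). Suppose $\varphi = \varphi'_+ - \varphi'_-$. For $f \ge 0$, using (B.64) and $\varphi'_-(g) \ge 0$,

$$\varphi_+(f) = \sup\{\varphi'_+(g) - \varphi'_-(g),\ 0 \le g \le f\}$$
$$\le \sup\{\varphi'_+(g),\ 0 \le g \le f\} \le \varphi'_+(f),$$

so $\psi = \varphi'_+ - \varphi_+ \ge 0$. With $\varphi'_\pm = \varphi_\pm + \psi$, imposing $\|\varphi\| = \|\varphi'_+\| + \|\varphi'_-\|$ and
repeatedly using (B.47), we find $\|\psi\| = 0$, and hence $\psi = 0$.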

4. This is trivial from parts 1–2, noting that any nonzero positive functional $\varphi = t\omega$
is a multiple of a state $\omega = \varphi/\|\varphi\|$, with $t = \|\varphi\|$, since obviously $\|\omega\| = 1$.


Combining this theorem with Corollaries B.17 and B.21, we finally obtain:


Theorem B.24. Let $X$ be a locally compact Hausdorff space. The Banach dual
$C_0(X)^*$ of all bounded linear maps $\varphi : C_0(X) \to \mathbb{C}$ is isometrically isomorphic with
the space $M(X)$ of all complete regular complex measures on $X$, with norm

$$\|\mu\| = |\mu|(X). \tag{B.75}$$

In particular, if $\mu$ is real (i.e., hermitian as a functional on $C_0(X)$), then (cf. (B.55))

$$\|\mu\| = \mu_+(X) + \mu_-(X). \tag{B.76}$$

This implies Corollary B.21, including its crucial final claim to the effect that states
on $C_0(X)$ correspond to regular probability measures on $X$.

We briefly sketch an analogous result for finitely additive measures. Instead of a
σ-algebra of subsets of some set $X$, we now start from a so-called semiring:


Definition B.25. A semiring of subsets of $X$ is a family $\mathcal{R} \subset \mathcal{P}(X)$ such that:


1. $\emptyset \in \mathcal{R}$;

2. if $A,B \in \mathcal{R}$, then $A \cap B \in \mathcal{R}$;

3. if $A,B \in \mathcal{R}$ and $B \subset A$, then for the complement of $B$ in $A$ we have $A\setminus B = \cup_{i=1}^{n} B_i$,
where $n < \infty$, each $B_i \in \mathcal{R}$, and the $B_i$ are pairwise disjoint.

In fact, in all our examples a stronger version of axiom 3 holds: if $A,B \in \mathcal{R}$ and
$B \subset A$, then $A\setminus B \in \mathcal{R}$. Indeed, we will typically have $X = \mathbb{N}$ and either $\mathcal{R} = \mathcal{P}(\mathbb{N})$
or $\mathcal{R} = \mathcal{P}_f(\mathbb{N})$ (i.e. the collection of finite subsets of $\mathbb{N}$).
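
To see what axiom 3 allows beyond its stronger version, one may check the axioms by brute force on a small finite family. The Python sketch below uses the family of discrete "intervals" $\{a,\ldots,b-1\}$ in $\{0,1,2,3\}$; this example is not among those used in the text and is chosen by us purely for illustration, as are all function names.

```python
def intervals(n):
    """All sets {a, ..., b-1} with 0 <= a <= b <= n (empty set for a = b)."""
    return {frozenset(range(a, b)) for a in range(n + 1) for b in range(a, n + 1)}

def is_disjoint_union(target, family):
    """Can `target` be written as a union of pairwise disjoint members of `family`?"""
    if not target:
        return True
    first = min(target)             # some member must cover the smallest element
    return any(first in S and S <= target and is_disjoint_union(target - S, family)
               for S in family)

def is_semiring(family):
    """Brute-force check of axioms 1-3 of Definition B.25 for a finite family."""
    has_empty = frozenset() in family
    closed_under_meet = all(A & B in family for A in family for B in family)
    differences_ok = all(is_disjoint_union(A - B, family)
                         for A in family for B in family if B <= A)
    return has_empty and closed_under_meet and differences_ok

R = intervals(4)
gap = frozenset(range(4)) - frozenset({1, 2})     # {0, 3}: not an interval
print(is_semiring(R), gap in R)
```

The intervals form a semiring in the sense of axiom 3, although the difference $\{0,3\}$ is only a disjoint union of two members and not itself a member, so the stronger version of the axiom fails for this family.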

Using the fundamental lemma for semirings, which states that if $A_1,\ldots,A_n \in \mathcal{R}$,
there are finitely many pairwise disjoint $B_1,\ldots,B_m$ in $\mathcal{R}$ such that $\cup_n A_n = \cup_m B_m$, it
can be shown that the complex linear span $\mathrm{Step}(X,\mathcal{R})$ of the characteristic functions
$1_A$ ($A \in \mathcal{R}$) is a commutative algebra under obvious pointwise operations. Since
functions in $\mathrm{Step}(X,\mathcal{R})$ are bounded, we may form the closure of $\mathrm{Step}(X,\mathcal{R})$ in the
supremum-norm; adding pointwise complex conjugation this yields a commutative
C*-algebra called $\ell^\infty(X,\mathcal{R})$ (which has a unit iff X =U). For example, we have

$$\ell^\infty(\mathbb{N},\mathcal{P}(\mathbb{N})) = \ell^\infty(\mathbb{N}) \equiv \ell^\infty; \tag{B.77}$$
$$\ell^\infty(\mathbb{N},\mathcal{P}_f(\mathbb{N})) = \ell_0(\mathbb{N}) \equiv c_0. \tag{B.78}$$


Definition B.26. A finitely additive measure on $(X,\mathcal{R})$ is a map $\mu : \mathcal{R} \to [0,\infty]$
such that $\mu(A \cup B) = \mu(A) + \mu(B)$ whenever $A,B \in \mathcal{R}$, $A \cup B \in \mathcal{R}$, and $A \cap B = \emptyset$.


Similarly, we have finitely additive signed measures taking values in $\mathbb{R}$, which admit
a Jordan–Hahn decomposition (B.55) with (B.56) - (B.57), just as in the σ-additive
case. We say that a finitely additive signed measure $\mu$ is finite if $|\mu(A)| < \infty$ for each
$A \in \mathcal{R}$, and bounded if $\sup\{|\mu(A)|,\ A \in \mathcal{R}\} < \infty$. With $|\mu| = \mu_+ + \mu_-$, the bounded
finitely additive signed measures form a real Banach space $ba(X,\mathcal{R})$ in the norm

$$\|\mu\| = \sup\{|\mu|(A),\ A \in \mathcal{R}\}. \tag{B.79}$$

Within this space, the probability measures stand out as those measures $\mu$ that take
values in $[0,1]$ (so that f = 1) and satisfy $\|\mu\| = 1$.


Functions in Step(X,𝒮) may be integrated against measures in ba(X,𝒮) in the
obvious way, cf. (B.27) - (B.28). This is well defined, and one easily infers that


\left| \int_X d\mu\, s \right| \leq \|\mu\| \, \|s\|_\infty (B.80)


for any s ∈ Step(X,𝒮). Hence we may extend the integral to any f ∈ ℓ^∞(X,𝒮) by


\int_X d\mu\, f = \lim_{n\to\infty} \int_X d\mu\, s_n, (B.81)
where (s_n) is any sequence in Step(X,𝒮) converging to f in the sup-norm ‖·‖_∞.


This is well defined by the usual arguments. At the end of the day, we obtain: 
Theorem B.27. Let X be a set equipped with some semiring 𝒮 ⊂ 𝒫(X).


• There is a bijective correspondence between finitely additive probability measures
μ on (X,𝒮) and states φ on ℓ^∞(X,𝒮), given by (B.39) and (B.81).

• This correspondence extends to an isometric isomorphism between ba(X,𝒮) and
the real Banach space of bounded hermitian functionals on ℓ^∞(X,𝒮).

• This isomorphism of real Banach spaces extends (i.e. complexifies) to an isomorphism
between the complexification ba(X,𝒮)_C and the (Banach) dual ℓ^∞(X,𝒮)*.
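
The estimate (B.80) can be checked by hand in a small case. The following is a minimal sketch, assuming a hypothetical four-point set X with the semiring of all its subsets and an arbitrarily chosen signed measure given by point weights; the names `weights`, `mu`, `integrate` and the sample step function `s` are illustrative and not taken from the text.

```python
from itertools import combinations

# Hypothetical toy data: X = {0, 1, 2, 3}, semiring = all subsets of X.
X = [0, 1, 2, 3]
weights = {0: 0.5, 1: -0.25, 2: 0.0, 3: 1.0}   # signed measure of each point

def mu(A):
    """Finitely additive signed measure of a subset A of X."""
    return sum(weights[x] for x in A)

def total_variation_norm():
    """||mu|| = sup over A of |mu|(A), with |mu| = mu_+ + mu_-, cf. (B.79)."""
    subsets = [c for r in range(len(X) + 1) for c in combinations(X, r)]
    return max(sum(abs(weights[x]) for x in A) for A in subsets)

def integrate(step):
    """Integral of a step function given as (coefficient, subset) pairs."""
    return sum(coeff * mu(A) for coeff, A in step)

def sup_norm(step):
    """Supremum norm on X of the step function sum_i coeff_i * 1_{A_i}."""
    return max(abs(sum(c for c, A in step if x in A)) for x in X)

# A sample step function with complex coefficients and overlapping sets.
s = [(2.0, (0, 1)), (-1.0 + 1.0j, (1, 3)), (0.5, (2,))]
lhs = abs(integrate(s))                       # |integral of s|
rhs = total_variation_norm() * sup_norm(s)    # ||mu|| * ||s||_infinity
print(lhs, "<=", rhs)
```

In this finite setting the supremum in (B.79) is attained at A = X, so the norm is simply the sum of the absolute weights.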




B.6 L^p spaces


We return to the usual, countably additive setting for measure theory. In the previous
section, the notion of a measure space (X,Σ,μ) has mainly been used to provide an
integration theory for continuous functions on X, though (B.29) suggested greater
generality. In what follows, we keep the restriction to locally compact spaces X
(although the theory is more general), but we expand the class of functions that
can be integrated over X “against the measure μ”. This, then, leads to an important
class of Banach spaces, called L^p(X) ≡ L^p(X,Σ,μ); some authors write L^p(X,μ),
others L^p(μ). One may have examples like X = Ω ⊂ R^n in mind, with Ω measurable
(typically open or closed, like X = R^n or X = [0, 1]), and μ being Lebesgue measure.
On the other hand, one may think of X as a discrete space with counting measure
(i.e., μ({x}) = 1 for each x ∈ X), in which case the space L^p(X) reduces to the
space ℓ^p(X) we already know; the typical case will be X = N.


Definition B.28. Given a measure space (X,Σ,μ) and a number 1 ≤ p ≤ ∞:


• For 1 ≤ p < ∞, the set ℒ^p(X) ≡ ℒ^p(X,Σ,μ) consists of all measurable
functions f : X → C for which


\int_X d\mu\, |f|^p < \infty. (B.82)
• ℒ^∞(X) ≡ ℒ^∞(X,Σ,μ) is the set of measurable functions f : X → C that are
essentially bounded (with respect to μ), i.e., for which
\inf\{t \in [0,\infty] : |f| \leq t \ (\mu\text{-almost everywhere})\} < \infty. (B.83)
• 𝒩_μ is the set of all measurable functions f : X → C that vanish μ-a.e., that is,
\mu(\{x \in X \mid f(x) \neq 0\}) = 0. (B.84)
• Noting that 𝒩_μ ⊂ ℒ^p(X) for all 1 ≤ p ≤ ∞, we put
L^p(X,\Sigma,\mu) \equiv L^p(X) = \mathcal{L}^p(X)/\mathcal{N}_\mu. (B.85)
To appreciate the perhaps somewhat mysterious condition (B.83), we write
\inf\{t \in [0,\infty] : |f| \leq t \ \mu\text{-a.e.}\} = \inf\{t \in [0,\infty] : \mu(\{x \in X, |f(x)| > t\}) = 0\}.
Compare this with the expressions (defined for any function f : X → C):


\sup\{|f(x)| \mid x \in X\} = \inf\{t \in [0,\infty] : |f(x)| \leq t \ \forall x \in X\}
= \inf\{t \in [0,\infty] : \{x \in X, |f(x)| > t\} = \emptyset\} < \infty, (B.86)
which state the condition that f be bounded. Consequently, the stipulation that f be
essentially bounded is the same as the condition that it is bounded, except that the
empty set in (B.86) has been replaced by a measure-zero set.


Theorem B.29. For 1 ≤ p < ∞, the set L^p(X) is a vector space under pointwise
operations, as well as a Banach space, in the norm


\|f\|_p = \left( \int_X d\mu\, |f|^p \right)^{1/p}. (B.87)
Likewise, L^∞(X) is a Banach space in the norm


\|f\|_\infty^{ess} = \inf\{t \in [0,\infty] : \mu(\{x \in X, |f(x)| > t\}) = 0\}. (B.88)
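
The difference between the supremum in (B.86) and the essential supremum in (B.88) is easy to see on a finite measure space that contains a null point. The sketch below is only an illustration under that assumption; the weights `mu`, the function `f` and the helper names are hypothetical choices, not part of the text.

```python
# Hypothetical finite measure space: points 0..4 with weights mu({x}).
# Point 4 carries no mass, so {4} is a null set.
mu = {0: 1.0, 1: 2.0, 2: 0.5, 3: 1.0, 4: 0.0}
f = {0: 1.0, 1: -3.0, 2: 2.0, 3: 0.5, 4: 100.0}   # huge value only on the null set

def sup_norm(f):
    """Ordinary supremum norm, cf. (B.86)."""
    return max(abs(v) for v in f.values())

def ess_sup_norm(f, mu):
    """Smallest t with mu({x : |f(x)| > t}) = 0, cf. (B.88).

    On a finite space the infimum is attained at one of the values |f(x)| (or 0).
    """
    candidates = sorted({0.0} | {abs(v) for v in f.values()})
    for t in candidates:
        if sum(mu[x] for x in f if abs(f[x]) > t) == 0:
            return t

def p_norm(f, mu, p):
    """(integral of |f|^p d mu)^(1/p), cf. (B.87)."""
    return sum(mu[x] * abs(f[x]) ** p for x in f) ** (1.0 / p)

# g agrees with f outside the null set {4}, so g and f define the same class in L^p.
g = dict(f)
g[4] = 0.0

print(sup_norm(f), ess_sup_norm(f, mu))        # the two norms differ for this f
print(p_norm(f, mu, 2) == p_norm(g, mu, 2))    # the p-norm ignores the null set
```

Here the supremum depends on the chosen representative (it changes when f is altered on the null point), whereas the essential supremum and the p-norms do not.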


Strictly speaking, elements of L^p are therefore equivalence classes of functions
rather than functions, the pertinent equivalence relation ~_μ being


f \sim_\mu g \ \text{ iff } \ \mu(\{x \in X \mid f(x) \neq g(x)\}) = 0, (B.89)


but whenever no confusion can arise, we write f ∈ L^p instead of f ∈ ℒ^p or [f] ∈ L^p,
as we have already done, for example, in (B.87) and (B.88); that is, the left-hand
sides of these equations should officially be written as ‖[f]‖_p for 1 ≤ p ≤ ∞. Note
in this respect that in (B.87) - (B.88) the function f on the right-hand side could
be any representative of its equivalence class [f]. However, one cannot replace the
right-hand side of (B.88) by ‖f‖_∞, because (B.86) does depend on the representative
f. Those who dislike (B.88) may, equivalently, write


\|f\|_\infty^{ess} = \inf\{\|g\|_\infty,\ g \sim_\mu f\}. (B.90)


One should be aware of the need to pass to the quotient (B.85) in the first place:
the natural expressions (B.87) and (B.88) fail to define norms on ℒ^p and ℒ^∞,
respectively, because the positive definiteness axiom in Definition A.1.5c might fail.
Indeed, although any f that is nonzero just on some null set is nonzero as an element
of the vector space ℒ^p, one has ‖f‖_p = 0. This problem is solved by passing to L^p.

The proof of Theorem B.29 uses both parts of Theorem B.13, which is concerned
with a sequence (f_n) of functions in ℒ^1(X), where (X,Σ,μ) is an arbitrary measure
space. Note that on our definition of L^p spaces, these pointwise limits themselves
might not lie in ℒ^1, but it is part of the conclusion of the convergence theorems that
they do so up to some null set, and hence do define elements of L^1. For this reason, at
this point one must distinguish between f ∈ ℒ^1 and [f] ∈ L^1. Let us mention in this
context that L^p spaces are often constructed from measurable functions f : X → C
whose positive real parts f_i (cf. (B.31)) by definition take values in [0,∞]. This also
leads to slightly more general versions of the Lebesgue convergence theorems, in
which the f_n are allowed to be infinite on null sets. However, if f ∈ L^p, then |f| < ∞
μ-a.e., so little is lost by starting from functions f : X → C or f : X → R.


Proof. We first prove Theorem B.29 for 1 ≤ p < ∞. Minkowski’s Inequality (B.14)
holds for L^p ≡ L^p(X) just as it does for ℓ^p, as does Hölder’s Inequality (B.15), so
it remains to prove completeness. To this effect, let (f_n) be a Cauchy sequence in L^p.
Then (f_n) has a subsequence (f_{n_k})_k such that


\|f_{n_{k+1}} - f_{n_k}\|_p \leq 2^{-k} (B.91)


for each k ∈ N (indeed, for given ε = 2^{-k}, take n_k to be the famously existing N for
which ‖f_n − f_m‖_p < ε for all n,m ≥ N, etc.), and if lim_{k→∞} ‖f_{n_k} − f‖_p = 0 for some
f, then lim_{n→∞} ‖f_n − f‖_p = 0 (this is a standard feature of Cauchy sequences).
We now rewrite f_{n_k} using a little trick, and introduce an auxiliary function g by


f_{n_k} = f_{n_1} + \sum_{l=1}^{k-1} (f_{n_{l+1}} - f_{n_l}); (B.92)
g_{n_k} = |f_{n_1}| + \sum_{l=1}^{k-1} |f_{n_{l+1}} - f_{n_l}|. (B.93)


Using (B.91), we estimate ‖g_{n_k}‖_p ≤ ‖f_{n_1}‖_p + Σ_{l=1}^{k-1} 2^{-l}, which converges as k → ∞.
Hence sup_k ‖g_{n_k}‖_p < ∞, so by the Monotone Convergence Theorem, lim_{k→∞} g_{n_k}^p = h
exists pointwise μ-a.e., with h ∈ L^1. Since g_{n_k} ≥ 0, we have h ≥ 0 at least μ-a.e.,
and with g = h^{1/p}, by continuity of x ↦ x^{1/p}, we have g_{n_k} → g pointwise μ-a.e.,
with g ∈ L^p. Thus the series (B.92) converges (absolutely pointwise μ-a.e.) to some
f. Since |f| ≤ g, we also have f ∈ L^p. To prove that f_{n_k} → f in L^p (and not just
pointwise μ-a.e.), we estimate


|f(x) - f_{n_k}(x)|^p \leq (2\max\{|f(x)|, |f_{n_k}(x)|\})^p
\leq 2^p (|f(x)|^p + |f_{n_k}(x)|^p) \leq 2^{p+1} |g(x)|^p,


so, already knowing that g^p ∈ L^1, we may use (B.36) in the Dominated Convergence
Theorem (with f_n replaced by f − f_{n_k}, and hence f replaced by the zero function)
to conclude that lim_{k→∞} ∫_X dμ |f(x) − f_{n_k}(x)|^p = 0, i.e., ‖f − f_{n_k}‖_p → 0.

We continue for p = ∞. For any fixed measurable subset E ⊂ X we define


\|f\|_\infty^{(E)} = \sup\{|f(x)| \mid x \in E\} = \inf\{t \in [0,\infty] \mid |f(x)| \leq t \ \forall x \in E\}. (B.94)
If X\E has measure zero, as we assume in what follows, then
\|f\|_\infty^{ess} \leq \|f\|_\infty^{(E)}, (B.95)


since the null set X\E might be expanded to a larger set of measure zero, which might
decrease the infimum in (B.88). It follows that convergence with respect to the norm
‖·‖_∞^{(E)} implies convergence in ‖·‖_∞^{ess}. We use this insight to prove the completeness
of L^∞ by reducing this to a limiting problem with respect to the norm ‖·‖_∞^{(E)} for a
suitable choice of E ⊂ X. Namely, let (f_n) be a Cauchy sequence in L^∞. This means


\forall \varepsilon > 0 \ \exists n \ \forall j,k > n : \|f_j - f_k\|_\infty^{ess} < \varepsilon.


Parametrizing ε = 1/m for large m ∈ N, and using (B.88), this implies:


\forall m \ \exists n \ \forall j,k > n \ \exists N(j,k,m) : \mu(N(j,k,m)) = 0 \text{ and } \forall x \in X \setminus N(j,k,m) : |f_j(x) - f_k(x)| < 1/m.


Now define N = ∪_{j,k,m∈N} N(j,k,m). Since measures are countably additive by definition
and N is a countable union of measure zero sets, N has measure zero. With
E = X\N, so that X\E = N has measure zero, as above, we then have


\forall m \ \exists n \ \forall j,k > n \ \forall x \in E : |f_j(x) - f_k(x)| < 1/m.


Thus (f_n) (strictly speaking, the corresponding sequence of restrictions of each f_n to
E) is a Cauchy sequence of bounded functions on E in the supremum norm (B.94),
so that we are back in the ℓ^∞(X) case with X = E, with the three-step proof we
gave: the pointwise limits f(x) = lim_{n→∞} f_n(x) exist, the function f thus defined on
E is bounded, i.e., ‖f‖_∞^{(E)} < ∞, and f_n → f not just pointwise but also in the norm
‖·‖_∞^{(E)}. Extending f from E to X in an arbitrary way (the ensuing equivalence class
in L^∞ does not depend on the behaviour of f on the null set X\E), we first conclude
from (B.95) that ‖f‖_∞^{ess} < ∞, and secondly infer that f_n → f also in ‖·‖_∞^{ess}.


Without proof, we state some useful results about the place of continuous functions
in L^p-spaces. For simplicity, we assume that μ is regular and has support X (in
that X has no nonempty open subset U with μ(U) = 0). In that case, C_b(X) and its
subspaces C_0(X) and C_c(X) may be seen as subspaces of L^∞(X), on which the norm
(B.88) or (B.90) simply reduces to the ordinary sup-norm (1.24).


Theorem B.30. • If 1 ≤ p < ∞, then C_c(X) is dense in L^p(X) (in the L^p-norm).
• If p = ∞, one has an inclusion of Banach spaces (all carrying the L^∞-norm)


C_0(X) \subseteq C_b(X) \subseteq L^\infty(X). (B.96)


Compare (B.22). Since the closure of C_c(X) is C_0(X), it follows that C_c(X) is dense in
L^∞(X) only in the exceptional case where L^∞(X) = C_0(X) (e.g., for finite X). So in
this respect, the values 1 ≤ p < ∞ behave quite differently from p = ∞.

The first claim is based on two facts, of which the first is true for all 1 ≤ p ≤ ∞,
whereas the second is valid only for 1 ≤ p < ∞ (i.e. it fails for p = ∞):


1. The set S(X) of simple functions s = Σ_i λ_i 1_{A_i}, where μ(A_i) < ∞ for each i, is
dense in L^p(X);

2. For each measurable subset A ⊂ X with μ(A) < ∞, and each ε > 0 there is a
function g ∈ C_c(X) such that ‖1_A − g‖_p < ε.


Similarly to Theorems B.27 and B.24, we know the state space of L^∞(X,ν):


Theorem B.31. Let (X,Σ,ν) be a measure space. There is a bijective correspondence
between states on L^∞(X,ν) and finitely additive probability measures μ on
(X,Σ) that are absolutely continuous with respect to ν (i.e., ν(A) = 0 implies
μ(A) = 0), given by (B.39) and (B.81).


In this case, the role of the semiring 𝒮 is of course played by Σ, so that Step(X,Σ)
is simply the complex linear span of the simple functions on (X,Σ), and (B.28) duly
applies. Since it may once again be shown that Step(X,Σ) is dense in L^∞(X,ν), the
definition (B.81) of integration “by continuity” makes sense in this situation, too.


B.7 Morphisms and isomorphisms of Banach spaces


We often want to say that two Banach spaces are isomorphic. For example, in the 
next section the dual of a given Banach space is typically identified with some 
known Banach space; such identifications even belong to the nicest results in func- 
tional analysis. Of course, this issue is predicated on the correct definition of (not 
necessarily invertible) maps between Banach spaces in the first place. 


Definition B.32. A morphism a : V → W between Banach spaces V, W (or, more
generally, normed spaces) is a bounded linear map, i.e., a linear map for which
there is a constant C > 0 such that for each v ∈ V,


\|av\|_W \leq C \|v\|_V, (B.97)


or, equivalently,
\sup\{\|av\|_W,\ v \in V,\ \|v\|_V \leq 1\} < \infty. (B.98)


It is extremely important (yet easy to show) that bounded maps are automatically 
continuous (and even uniformly continuous); conversely, a continuous linear map 
between vector spaces with norm is bounded. We note two important special cases: 


• If W = V, a morphism a : V → V is called a (bounded) operator on V.
• If W = C, a morphism φ : V → C is called a (bounded linear) functional on V.


Theorem B.33. Let V be a normed vector space and W a Banach space. The space
B(V,W) of all morphisms (i.e., bounded linear maps) a : V → W is a Banach space
with respect to pointwise operations (e.g., (λa+b)v = λav + bv), and the norm


\|a\| = \sup\{\|av\|_W,\ v \in V,\ \|v\|_V \leq 1\}. (B.99)


Proof. Only completeness is nontrivial; the idea is that if (a_n) is a Cauchy sequence
in B(V,W), we define a : V → W by av = lim_n a_n v. This limit exists, since W is
complete and (a_n v) is a Cauchy sequence in W because of the estimate
‖a_n v − a_m v‖_W ≤ ‖a_n − a_m‖ ‖v‖_V. Furthermore, it is easy to show (e.g., by
contradiction) that a Cauchy sequence must be bounded, say ‖a_n‖ ≤ K, and that, if
a_n v → w, then also ‖a_n v‖_W → ‖w‖_W. Hence ‖av‖_W = lim_n ‖a_n v‖_W ≤ K‖v‖_V, so
a ∈ B(V,W). Finally, a_n → a, since for ‖v‖_V ≤ 1 and, given ε > 0, the usual N for
which ‖a_n − a_m‖ < ε/2 for all n,m > N, and some m > N (depending on v) for which
‖av − a_m v‖_W < ε/2, we obtain


\|av - a_n v\|_W \leq \|av - a_m v\|_W + \|a_m v - a_n v\|_W < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon. (B.100)


Since this holds for any v ∈ V with ‖v‖_V ≤ 1, eq. (B.99) gives ‖a − a_n‖ ≤ ε for all n > N.


Clearly, if a ∈ B(V,W), then one has the useful estimate, cf. (A.20),
\|av\|_W \leq \|a\| \, \|v\|_V, (B.101)
and if W = V and a, b ∈ B(V) ≡ B(V,V), we also have (cf. (A.21))


\|ab\| \leq \|a\| \, \|b\|. (B.102)


Indeed, B(V) is a Banach algebra, which is just to say that it is a Banach space as
well as an algebra, in which (B.102) holds (a C*-algebra will be a special case).
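
The norm (B.99) and the estimates (B.101) - (B.102) can be made concrete for matrices. The following sketch assumes V = R^2 with the sup-norm, for which the supremum over the unit ball is attained at the sign vectors (±1, ±1); the matrices `a`, `b` and all helper names are hypothetical choices for illustration only.

```python
from itertools import product

def sup_norm(v):
    return max(abs(x) for x in v)

def apply(a, v):
    """Matrix-vector product av."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def matmul(a, b):
    """Matrix product ab."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def op_norm(a):
    """Operator norm (B.99) with respect to the sup-norm on R^n.

    v -> ||av|| is convex, so its supremum over the unit ball is attained
    at an extreme point of the ball, i.e. at a vector with entries +1 or -1.
    """
    n = len(a[0])
    return max(sup_norm(apply(a, list(v))) for v in product([-1.0, 1.0], repeat=n))

a = [[1.0, -2.0], [0.5, 3.0]]
b = [[0.0, 1.0], [-1.0, 2.0]]
v = [0.3, -0.7]

print(op_norm(a), op_norm(b), op_norm(matmul(a, b)))
print(sup_norm(apply(a, v)) <= op_norm(a) * sup_norm(v))      # (B.101)
print(op_norm(matmul(a, b)) <= op_norm(a) * op_norm(b))       # (B.102)
```

For this particular norm the result coincides with the largest absolute row sum of the matrix, which gives a quick cross-check of the values printed above.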

Returning to our opening theme, the level of discourse now suddenly becomes 
quite advanced. We start with Banach’s famous Open Mapping Theorem. 


Theorem B.34. If V and W are Banach spaces and a ∈ B(V,W) is surjective, then
a is open (in mapping open sets to open sets).


Proof. For fixed u ∈ V we write V_r(u) = {v ∈ V : ‖u − v‖ < r} for the open r-ball
around u, with V_r ≡ V_r(0) and hence V_r(u) = u + V_r. Furthermore, the closure of
U ⊂ V is denoted by U^-. Likewise for W. The theorem follows if aV_1 ≡ a(V_1) ⊂ W
contains an open ball W_s, for some s > 0 (in which case, by linearity, aV_r contains
an open ball W_{rs} for any r > 0). By the theory of metric spaces, some subset U ⊂ V
is open iff for any u ∈ U there is r > 0 such that V_r(u) ⊂ U. Then aU contains the
open set W_{rs}(au), and since au ∈ aU is arbitrary, aU is open by the same criterion.

To prove that aV_1 contains an open ball, first note that since a : V → W is surjective,
W = ∪_n aV_n, so that by the Baire Category Theorem (which applies because
Banach spaces are complete metric spaces by definition) some (aV_n)^- contains an
open set, and hence an open ball. Since a is linear this must then be true for all n; let
us take n = 1, so that W_ε(w_0) ⊂ (aV_1)^- for some w_0 ∈ (aV_1)^-. Since any point in
the closure of some U ⊂ W can be approximated by points in U, there is w_1 ∈ aV_1
such that ‖w_1 − w_0‖ < ½ε. Hence for any w ∈ W_{ε/2} we have


\|(w_1 - w) - w_0\| \leq \|w_1 - w_0\| + \|w\| < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon, (B.103)


so w_1 − w ∈ W_ε(w_0) and hence w_1 − w ∈ (aV_1)^-. Similarly, w_1 + w ∈ (aV_1)^-. Since
w = ½(w_1 + w) − ½(w_1 − w), we obtain w ∈ (aV_1)^-, for if x, y ∈ (aV_1)^-, then we have
−y ∈ (aV_1)^- and ½(x + y) ∈ (aV_1)^-. Since w ∈ W_{ε/2} was arbitrary, it follows that
W_{ε/2} ⊂ (aV_1)^-.

To produce an open ball in aV_1 rather than in its closure, let w_0 ∈ W_{ε/4}, so that
2w_0 ∈ W_{ε/2}. Hence there exists w'_1 ∈ aV_1 such that ‖2w_0 − w'_1‖ < ε/4. And because
2(2w_0 − w'_1) ∈ W_{ε/2}, there exists w'_2 ∈ aV_1 such that ‖2(2w_0 − w'_1) − w'_2‖ < ε/4, et
cetera. Because 2(2(2w_0 − w'_1) − w'_2) ∈ W_{ε/2}, there exists w'_3 ∈ aV_1, ...

Repeating this N times, we obtain a sequence (w'_n) in aV_1 such that for any N ∈ N,


\left\| 2^N w_0 - \sum_{n=1}^N 2^{N-n} w'_n \right\| < \tfrac14\varepsilon, (B.104)
i.e., ‖w_0 − Σ_{n=1}^N 2^{-n} w'_n‖ < 2^{-N-2} ε. Letting N → ∞ then gives w_0 = Σ_{n=1}^∞ 2^{-n} w'_n.
Since w'_n ∈ aV_1, there is a corresponding sequence (v'_n) in V_1 such that av'_n = w'_n,
with ‖v'_n‖ < 1 for each n. Hence we may estimate Σ_{n=1}^∞ ‖2^{-n} v'_n‖ < Σ_{n=1}^∞ 2^{-n} = 1,
so the series Σ_n 2^{-n} v'_n in V is absolutely convergent and hence convergent. Since V
is assumed complete, it has a limit v' = Σ_{n=1}^∞ 2^{-n} v'_n. Since


\left\| \sum_{n=1}^N 2^{-n} v'_n \right\| \leq \sum_{n=1}^N 2^{-n} \|v'_n\| \leq \sum_{n=1}^\infty 2^{-n} \|v'_n\| < 1,


letting N → ∞ gives ‖v'‖ < 1, or v' ∈ V_1. Since a is bounded and hence continuous,


av' = a\left( \sum_{n=1}^\infty 2^{-n} v'_n \right) = \sum_{n=1}^\infty 2^{-n} a v'_n = \sum_{n=1}^\infty 2^{-n} w'_n = w_0. (B.105)


We now recall that w_0 ∈ W_{ε/4} was arbitrary, so we have shown that W_{ε/4} ⊂ a(V_1).
By linearity of a, it follows that W_s ⊂ aV_1 for any s < ε/4.


Corollary B.35. Let V and W be Banach spaces. The (set-theoretic) inverse a^{-1} of
a bijective morphism a ∈ B(V,W) is automatically linear and bounded.


In other words, a^{-1} lies in B(W,V). Corollary B.35 suggests defining two Banach
spaces V and W to be isomorphic if there exists a bijective morphism a € B(V,W) 
(in which case they would be isomorphic as objects in the category of Banach spaces 
with bounded linear maps). However, we often prefer to use a sharper notion. 


Definition B.36. Let V and W be normed spaces. 


1. An isometry from V to W is a linear map u : V → W satisfying
\|uv\|_W = \|v\|_V, \quad v \in V. (B.106)


2. An isometric isomorphism from V to W is a surjective isometry u : V → W.


Since an isometry is clearly bounded as well as injective, by Corollary B.35 a sur- 
jective isometry has a bounded linear inverse, which is easily seen to be isometric, 
too. In practice, it is the conditions in Definition B.36 that one typically checks. 
Nonetheless, the non-isometric case is also quite important. As a case in point, we 
prove a classical result of functional analysis, called the Closed Graph Theorem. In 
preparation, note that two normed spaces V, W define a third one, called their direct 
sum V ⊕ W, which as a set is V × W, turned into a vector space by the operations
(v_1,w_1) + (v_2,w_2) = (v_1 + v_2, w_1 + w_2) and λ(v,w) = (λv,λw), etc., with norm


\|(v,w)\| = \|v\|_V + \|w\|_W. (B.107)


It is easily shown that if V and W are Banach spaces, then so is V ⊕ W.
Furthermore, if a : V → W is any linear map, the graph of a is the vector space


G(a) = \{(v,av),\ v \in V\} \subset V \oplus W. (B.108)


If a is bounded, then G(a) is closed (i.e. in the norm inherited from the Banach
space V ⊕ W). The converse, then, is the Closed Graph Theorem:


Theorem B.37. Let V and W be Banach spaces and let a : V → W be a linear map.
If the graph G(a) is closed (in the norm inherited from V ⊕ W), then a is bounded.


Proof. Let b : G(a) → V be the linear map (v,av) ↦ v, which is clearly a bijection,
with inverse b^{-1} : V → G(a), b^{-1}(v) = (v,av). Furthermore, ‖b(v,av)‖ =
‖v‖_V ≤ ‖v‖_V + ‖av‖_W = ‖(v,av)‖, so b is bounded. Since G(a) is a closed subspace of
the Banach space V ⊕ W, it is itself a Banach space, so that Corollary B.35 makes
b^{-1} bounded as well, i.e., ‖b^{-1}(v)‖ ≤ C‖v‖_V for some C > 0. Hence ‖(v,av)‖ =
‖v‖_V + ‖av‖_W ≤ C‖v‖_V. So ‖av‖_W ≤ (C − 1)‖v‖_V, and hence a is bounded.




B.8 The Hahn-Banach Theorem 


In this section we present another traditional pillar of functional analysis. 


Definition B.38. A sublinear functional on a real vector space V is a map p : V →
R that for each v, w ∈ V and scalars t ≥ 0 satisfies


p(v + w) \leq p(v) + p(w); (B.109)
p(tv) = t\,p(v). (B.110)


We will deal with two examples of such functionals. One is simply a norm (even on a
complex vector space, which in particular is a real vector space). For the other, recall
that a subset K of a real vector space V is called convex if whenever v, w ∈ K and
t ∈ (0,1), one has tv + (1 − t)w ∈ K. Even without a topology on V, we can define
an interior point of K (or indeed of any subset of V) as a point v ∈ K such that for
each v' ∈ V there is ε > 0 such that v + tv' ∈ K for any 0 ≤ t < ε. We denote the
set of interior points of K by int(K). For example, if V is normed (with associated
topology), or is the dual of a normed space equipped with the w*-topology (or,
even more generally, if V is a topological vector space, i.e., a vector space carrying
a Hausdorff topology in which addition and scalar multiplication are continuous),
then each point of an open set U is interior in the above sense, so that U = int(U).
Let K ⊂ V be convex and suppose it contains 0 as an interior point. Then the
Minkowski functional (also called gauge) p : V → R^+
of K is defined by
p(v) = \inf\{a > 0 \mid v/a \in K\}. (B.111)


Note that p(v) < ∞, because 0 ∈ K is interior, so that there is ε > 0 such that εv ∈ K,
and hence a = 1/ε lies in the set in (B.111). It is clear that if v ∈ K, then a = 1 lies
in the set in (B.111), so that p(v) ≤ 1. As a simple example, for the (open or closed)
unit ball B in a normed space (both of which are convex), we have p(v) = ‖v‖.
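
The definition (B.111) lends itself to a simple numerical approximation. The sketch below assumes that K is given by a membership test and uses bisection, which works because for convex K containing 0 the set {a > 0 : v/a ∈ K} is an interval that is unbounded above; the function names, the upper bound `hi` and the two example sets in R^2 are hypothetical choices.

```python
def minkowski(v, contains, hi=1e6, tol=1e-12):
    """Approximate p(v) = inf{a > 0 : v/a in K} from (B.111) by bisection.

    Assumes K is convex with 0 as an interior point, and that p(v) <= hi.
    `contains` is a membership test for K.
    """
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if contains([x / mid for x in v]):
            hi = mid      # v/mid lies in K, so p(v) <= mid
        else:
            lo = mid      # v/mid lies outside K, so p(v) >= mid
    return hi

# Hypothetical example sets in R^2.
def in_disc(x):
    """Closed Euclidean unit ball."""
    return x[0] ** 2 + x[1] ** 2 <= 1.0

def in_ellipse(x):
    """Convex set {x : (x0/2)^2 + x1^2 <= 1}."""
    return (x[0] / 2.0) ** 2 + x[1] ** 2 <= 1.0

v, w = [3.0, 4.0], [-1.0, 2.0]
vw = [v[0] + w[0], v[1] + w[1]]

print(minkowski(v, in_disc))                       # approximately the norm of v
print(minkowski(vw, in_ellipse),
      minkowski(v, in_ellipse) + minkowski(w, in_ellipse))   # subadditivity (B.109)
```

For the unit disc the result approximates the Euclidean norm, in line with the example p(v) = ‖v‖; for the ellipse it approximates the square root of (v_0/2)^2 + v_1^2.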


Proposition B.39. Let K ⊂ V be convex and let 0 ∈ K be an interior point of K.
Then the Minkowski functional p of K satisfies (B.109) - (B.110). Furthermore, we
may recover the set int(K) of interior points of K through


int(K) = \{v \in V \mid p(v) < 1\}. (B.112)
Conversely, if some function p : V → R^+ satisfies (B.109) - (B.110), then the set
K = \{v \in V \mid p(v) \leq 1\} (B.113)
is convex, with interior given by (B.112).


For example, if K is open (in a topological vector space), then (B.112) equals K. 


Proof. Given (B.111), eq. (B.110) is obvious. To prove (B.109), find a > 0 and b > 0
such that v/a ∈ K and w/b ∈ K; cf. the comment after (B.111). Since K is convex,
with t = a/(a+b) and hence 1 − t = b/(a+b) we have t·v/a + (1 − t)·w/b ∈ K.
Hence p(t·v/a + (1 − t)·w/b) ≤ 1, which, using (B.110), reads p(v + w) ≤ a + b.
Taking the infimum over a and b constrained by v/a ∈ K, w/b ∈ K then turns the
right-hand side into p(v) + p(w), so that we have proved (B.109).

The proof of the converse claims is almost trivial, except perhaps for the last
claim. To prove that p(v) < 1 implies v ∈ int(K), we note that for any v' ∈ V and
ε > 0, from (B.109) - (B.110) we have p(v + εv') ≤ p(v) + εp(v'). If p(v') = 0, this
gives p(v + εv') ≤ p(v) < 1, so that v + εv' ∈ K. If not, assume p(v) = 1 − δ for
some δ ∈ (0, 1], and we find that p(v + εv') < 1 for any 0 ≤ ε < δ/p(v').


Having motivated Definition B.38, we now state the Hahn—Banach Theorem: 


Theorem B.40. Let V be a real vector space equipped with a sublinear functional
p, and let W ⊂ V be a linear subspace carrying a linear map φ_W : W → R that is
dominated by p in the sense that for each w ∈ W we have φ_W(w) ≤ p(w).

Then φ_W has a linear extension φ : V → R that for each v ∈ V satisfies


\varphi(v) \leq p(v). (B.114)
Proof. Take v_1 ∈ V, v_1 ∉ W, and extend φ_W to W ⊕ R·v_1 by
\varphi(w + t v_1) = \varphi_W(w) + t\,\varphi(v_1), (B.115)
with t ∈ R and φ(v_1) to be determined. In order to satisfy (B.114), we need


\varphi(w + t v_1) \leq p(w + t v_1), (B.116)


for each w ∈ W and t ∈ R. Using (B.110), this is true iff it is true for t = ±1, which
yields two conditions (in two variables w, w' ∈ W), which may jointly be written as


\varphi(w') - p(w' - v_1) \leq \varphi(v_1) \leq p(w + v_1) - \varphi(w). (B.117)

Since φ is linear, this can obviously be satisfied by some φ(v_1) ∈ R iff
\varphi(w + w') \leq p(w + v_1) + p(w' - v_1), (B.118)
which is indeed the case: for by assumption we have φ(w + w') ≤ p(w + w'), whence
\varphi(w + w') \leq p(w + v_1 + w' - v_1) \leq p(w + v_1) + p(w' - v_1), (B.119)


where we used (B.109). Hence any choice of φ(v_1) that satisfies (B.117) provides
an extension (B.115) of φ to W ⊕ R·v_1, which by construction satisfies (B.114).
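
The one-step extension can be traced in a two-dimensional example. The sketch below assumes V = R^2 with p the ℓ^1-norm, W = R·e_1 with φ_W(s e_1) = s, and v_1 = e_2; all of these are hypothetical choices. The supremum and infimum in (B.117) are only sampled on a finite grid, which happens to give the exact bounds in this particular case.

```python
def p(x):
    """Sublinear functional on R^2 (here the l^1-norm, a hypothetical choice)."""
    return abs(x[0]) + abs(x[1])

def phi_W(s):
    """Functional on W = R*e1 with phi_W(s*e1) = s; dominated by p since s <= |s|."""
    return s

grid = [k / 4.0 for k in range(-40, 41)]    # sample values s, for w = s*e1

# Bounds from (B.117) with v1 = e2 = (0, 1):
lower = max(phi_W(s) - p((s, -1.0)) for s in grid)   # sup over w' of phi(w') - p(w' - v1)
upper = min(p((s, 1.0)) - phi_W(s) for s in grid)    # inf over w of p(w + v1) - phi(w)

def phi(x, c):
    """Extension (B.115) with the choice phi(v1) = c."""
    return phi_W(x[0]) + x[1] * c

def dominated(c):
    """Check (B.114) on the sample grid."""
    return all(phi((s, t), c) <= p((s, t)) for s in grid for t in grid)

print(lower, upper)
print([dominated(c) for c in (lower, 0.3, upper)])   # admissible choices
print(dominated(1.5))                                # a choice outside [lower, upper]
```

Any value of φ(v_1) between the two bounds gives a dominated extension, which illustrates why the extension in Theorem B.40 is in general far from unique.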

Lovers of Zorn’s Lemma may now complete the proof as follows. Let F be the
set of all pairs (φ,X), where X ⊂ V is a linear subspace and φ : X → R is a linear
extension of φ_W that satisfies (B.114). We partially order F by


(\varphi_1, X_1) \leq (\varphi_2, X_2) \ \text{ iff } \ X_1 \subseteq X_2 \text{ and } \varphi_1(v) = \varphi_2(v) \ \forall v \in X_1. (B.120)


Then F is clearly nonempty, and every totally ordered subset {(φ_λ, X_λ)} of F has
an upper bound (φ,X), where X = ∪_λ X_λ and φ(v) = φ_λ(v) whenever v ∈ X_λ.
Thus Zorn’s Lemma applies, “giving” a maximal element (φ,Z). If Z ≠ V, one may
extend Z by the first step of the proof (with W replaced by Z), contradicting maximality
of (φ,Z). Hence Z = V, and φ is the desired functional.


If V is finite-dimensional, then Zorn’s Lemma is unnecessary, and a constructive 
proof may be given by repeating the first step of the proof a finite number of times. 


Corollary B.41. Let V be a normed vector space, with dual V*, and let W ⊂ V be a
linear subspace (inheriting the norm from V, with associated dual W*).
Then each φ_W ∈ W* has an extension φ ∈ V* to V with the same norm.


Proof. We take p(v) = ‖φ_W‖ ‖v‖, which clearly satisfies (B.109) - (B.110). If V is
real, Theorem B.40 gives φ : V → R satisfying |φ(v)| ≤ ‖φ_W‖ ‖v‖ for each v ∈ V
(apply (B.114) to both v and −v), and hence ‖φ‖ ≤ ‖φ_W‖. But ‖φ_W‖ ≤ ‖φ‖ since
W ⊂ V, hence ‖φ‖ = ‖φ_W‖.

If V is complex, we first regard it as a real vector space, take the real part φ'_W of
φ_W, and isometrically extend φ'_W to a linear functional φ' : V → R as above, so that
‖φ'‖ = ‖φ'_W‖. Then define φ : V → C by


\varphi(v) = \varphi'(v) - i\varphi'(iv). (B.121)


One checks that φ((s + it)v) = (s + it)φ(v). Since φ'(v) is the real part of φ(v),
with |φ(v)|^2 = |φ'(v)|^2 + |φ'(iv)|^2, we have |φ'(v)| ≤ |φ(v)| and hence ‖φ'‖ ≤ ‖φ‖.
Conversely, for any v with φ(v) ≠ 0, take z = |φ(v)|/φ(v), so that |φ(v)| = φ(zv).
Hence φ(zv) is real and therefore it is equal to its real part, so that, since |z| = 1,


|\varphi(v)| = \varphi(zv) = \varphi'(zv) \leq \|\varphi'\| \, \|zv\| = \|\varphi'\| \, \|v\|.


Therefore, ‖φ‖ ≤ ‖φ'‖, and hence ‖φ‖ = ‖φ'‖. The same computation applies to
φ_W, yielding ‖φ_W‖ = ‖φ'_W‖, so that finally ‖φ‖ = ‖φ'‖ = ‖φ'_W‖ = ‖φ_W‖.
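
The passage from the real part back to a complex-linear functional in (B.121) can be checked numerically. The sketch below assumes V = C^2 and starts from a hypothetical complex-linear functional `phi0` with arbitrarily chosen coefficients, keeps only its real part, and then rebuilds the functional via (B.121); none of the names come from the text.

```python
# Hypothetical complex-linear functional on C^2: phi0(v) = c[0]*v[0] + c[1]*v[1].
c = (2.0 - 1.0j, 0.5 + 3.0j)

def phi0(v):
    return c[0] * v[0] + c[1] * v[1]

def phi_real(v):
    """Real part of phi0: a real-linear functional on C^2 seen as a real space."""
    return phi0(v).real

def phi(v):
    """Formula (B.121): phi(v) = phi'(v) - i*phi'(iv), using only the real part."""
    iv = (1j * v[0], 1j * v[1])
    return phi_real(v) - 1j * phi_real(iv)

v = (1.0 + 2.0j, -0.5 + 0.25j)
z = 0.6 - 0.8j                       # a complex scalar with |z| = 1
zv = (z * v[0], z * v[1])

print(abs(phi(v) - phi0(v)))         # phi recovers phi0 up to rounding
print(abs(phi(zv) - z * phi(v)))     # complex linearity of phi
print(abs(phi_real(v)) <= abs(phi(v)))
```

The first printed value being (numerically) zero reflects that a complex-linear functional is determined by its real part, which is the point of the trick.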


In fact, this trick to pass from the real to the complex case was overlooked by Hahn
and Banach themselves, whose arguments were much more involved.

As to Zorn’s Lemma, if V is infinite-dimensional but still separable, using (countable)
induction one may construct a sequence (v_n) of linearly independent unit vectors
in V\W, such that V is the closed linear span of W and the v_n. The above
procedure then gives φ on the real algebraic linear span of W and the v_n, which is
bounded by construction and may be extended to all of V by continuity. However,
the construction of (v_n) still requires a weaker form of the Axiom of Choice (which
is equivalent to Zorn’s Lemma), namely the so-called Axiom of Dependent Choice.

In the situation of Corollary B.41, the extension φ is unique iff the normed space
V is strictly convex, which by definition means that its unit sphere is strictly convex,
i.e., if ‖v‖ = ‖w‖ = 1 for v ≠ w and t ∈ (0, 1), then ‖tv + (1 − t)w‖ < 1. Equivalently, if
‖v‖ = ‖w‖ = ½‖v + w‖, then v = w. This is the case, for example, in Hilbert spaces
H, as easily follows from the comment after (A.3). Indeed, anticipating Theorem
B.66, if W ⊂ H is closed (as we may assume, since φ_W is continuous), we may
identify φ_W : W → C with some vector φ_W ∈ W, and if we do, the unique extension
φ : H → C corresponds to the same vector φ_W, now regarded as an element of H.




Corollary B.42. Let V be a normed vector space, with dual V*, and fix some
nonzero vector v_0 ∈ V. There exists a functional φ ∈ V* such that


\varphi(v_0) = \|v_0\|; (B.122)
\|\varphi\| = 1. (B.123)


Proof. Take W = C·v_0 and φ_W(λv_0) = λ‖v_0‖ in Corollary B.41, so that ‖φ_W‖ = 1
by construction.


We now turn to an application of Theorem B.40 to convexity theory, which we
will need for the Krein–Milman Theorem (and hence eventually for the existence of
pure states on C*-algebras). Although we will apply the theorem below to the dual of
a normed vector space in its w*-topology, the setting is more general; all we need is
a few easily established facts for topological vector spaces V, namely that if U ⊂ V
is open, then so is every translate U + v of U, and so is εU, for any ε > 0, and
hence also (−εU) ∩ (εU). Furthermore, a linear map φ : V → R is continuous iff it
is continuous at 0. These elementary facts will be used in the proof below.


Theorem B.43. Let V be a real topological vector space and let A and B be disjoint
nonempty convex subsets of V, with A open. Then there is a continuous linear
functional φ : V → R and some t ∈ R such that φ(a) < t ≤ φ(b) for all a ∈ A, b ∈ B.


Proof. Form C = A − B = {a − b | a ∈ A, b ∈ B}, which is convex and open (as it
is a union of open sets A − b over b ∈ B). Then move C so that it contains 0, by
taking any a_0 ∈ A and b_0 ∈ B and defining K = C + v_0, with v_0 = b_0 − a_0. Thus K
has its associated Minkowski functional p_K, cf. (B.111). Noting that v_0 ∉ K (since
A ∩ B = ∅), we have p_K(v_0) ≥ 1. With W = R·v_0, define a functional φ_W : W → R
by φ_W(sv_0) = s for s ∈ R. This implies φ_W(v) ≤ p_K(v) for v ∈ R·v_0: if v = sv_0 with
s ≥ 0, this is obvious from (B.110) and φ_W(v_0) = 1, and if s < 0, then φ_W(v) < 0
whereas p_K(v) ≥ 0. We now use Theorem B.40 to extend φ_W to a functional φ :
V → R satisfying (B.114), which implies φ(v) ≤ p_K(v) < 1 for any v ∈ K. Taking
v = a − b + v_0 gives φ(a) < φ(b) for any a ∈ A, b ∈ B. Taking t = inf{φ(b) | b ∈ B},
the last claim of the theorem follows. Finally, since φ(v) < 1 for each v ∈ K, we have
φ^{-1}(−ε, ε) ⊃ (−εK) ∩ (εK), which is open and contains 0, so that φ is continuous.


This is the precise result we will need, but variations abound. If A and B are both
open, in which case φ(B) is open, we have φ(a) < t < φ(b). If V is locally convex,
in that its topology has a basis consisting of convex sets, then if A is closed and
B is compact, there are disjoint open convex sets A' and B' containing A and B,
respectively, so that also in this case we obtain the strict inequalities just mentioned.

Finally, even if V has no topology, we can still show that φ(a) ≤ t ≤ φ(b) on the
mere assumption that A has an interior point (φ then lacks continuity, of course).

Results like this are often called separation theorems. Namely, a plane H in R^3
always takes the form x_0 + ker φ, where x_0 ∈ R^3 and φ : R^3 → R is a
(nonzero) linear map. Equivalently, H = φ^{-1}(c), where c = φ(x_0). More generally,
a hyperplane in a vector space V is a (nonempty) affine subspace of the form H = φ^{-1}(c),
where φ is a nonzero linear functional on V; clearly, H has codimension one and if V is a
topological vector space and φ is continuous, then H is closed. So Theorem B.43
shows that A and B are separated by the closed hyperplane H = φ^{-1}(t).


B.9 Duality


We now turn to duality theory. For any normed (but not necessarily complete) vector
space V, Theorem B.33 shows that the space V* of all morphisms φ : V → C is a
Banach space, called the dual of V. By (B.99), the norm of φ ∈ V* is given by


\|\varphi\| = \sup\{|\varphi(v)|,\ v \in V,\ \|v\|_V \leq 1\}. (B.124)
Any morphism a ∈ B(V,W) induces a dual morphism a* ∈ B(W*,V*) by
(a^*\varphi)(v) = \varphi(av), \quad \varphi \in W^*. (B.125)
By definition of the various norms involved here, we find
\|a^*\| = \sup\{|\varphi(av)|,\ \varphi \in W^*,\ v \in V,\ \|\varphi\| = \|v\| = 1\}. (B.126)
Since |φ(av)| ≤ ‖φ‖ ‖av‖ ≤ ‖a‖, this immediately yields
\|a^*\| \leq \|a\|. (B.127)


In fact, one even has
\|a^*\| = \|a\|, (B.128)


but unexpectedly heavy machinery (namely the Hahn–Banach Theorem) is required
to prove this. By Corollary B.42 (applied to W), for any v ∈ V with av ≠ 0, there
exists φ ∈ W* with ‖φ‖ = 1 and φ(av) = ‖av‖, so from (B.126) we have ‖a*‖ ≥ ‖av‖
for any v ∈ V with ‖v‖ = 1. Taking the supremum over such v and using (B.99) gives
‖a*‖ ≥ ‖a‖. With our earlier (B.127), this gives (B.128).
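
In finite dimensions the equality (B.128) can be verified directly. The sketch below assumes V = W = R^2 with the ℓ^1-norm, whose dual carries the ℓ^∞-norm, so that the dual morphism a* of (B.125) acts as the transposed matrix; the matrix `a` and the helper names are hypothetical. Both suprema are evaluated at the extreme points of the respective unit balls, where convex functions attain their maximum.

```python
from itertools import product

def l1(v):
    return sum(abs(x) for x in v)

def linf(v):
    return max(abs(x) for x in v)

def apply(a, v):
    """Matrix-vector product av."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

a = [[1.0, -4.0], [2.0, 0.5]]
n = 2

# ||a|| on (R^2, l^1): the extreme points of the unit ball are +e_j and -e_j.
basis = [[s if i == j else 0.0 for i in range(n)] for j in range(n) for s in (1.0, -1.0)]
norm_a = max(l1(apply(a, v)) for v in basis)

# a* acts on the dual (R^2, l^infinity) as the transpose of a;
# the extreme points of its unit ball are the sign vectors.
signs = [list(s) for s in product([-1.0, 1.0], repeat=n)]
norm_a_star = max(linf(apply(transpose(a), f)) for f in signs)

print(norm_a, norm_a_star)
```

Both numbers equal the largest absolute column sum of `a`, which is the familiar formula for the operator norm induced by the ℓ^1-norm.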

Another application of Corollary B.42 lies in the double dual V** = (V*)*.


Proposition B.44. For any normed space V, the map v ↦ v̂ from V to V**, given by

\hat{v}(\varphi) = \varphi(v), \quad \varphi \in V^*, (B.129)
is isometric (and hence injective), mapping V onto a closed subspace V̂ ⊂ V**.
This will follow from part 1 of the following consequence of Corollary B.42:


Corollary B.45. Let V be a normed vector space, with dual V* 


I. For any v € V, one has 


Ilvll = sup{le()|,@ €V", llell = 1. (B.130) 


2. For any w # v, there exists 9 € V* with @(w) 4 @(v). 
3. For any a € B(V,W), we have 


\|a|| = sup{|t(av)|,v €V,t € W%, ||v|| = |||] = 1}. (B.131) 


Proof. This is the proof of Corollary B.45.

1. If $\|\varphi\| = 1$, then $|\varphi(v)| \le \|v\|$, so the supremum is $\le \|v\|$. But according to Corollary B.42 the supremum is $\ge \|v\|$.

2. Take $v_0 = v - w$ in Corollary B.42 and use the previous item.

3. Apply part 1 in $W$ to $\|av\|$ and use (B.99).


Proof. And this is the proof of Proposition B.44. Note that $\|\hat v\| \le \|v\|$, since

$$\|\hat v\| = \sup\{|\varphi(v)|,\ \varphi \in V^*,\ \|\varphi\| = 1\}, \tag{B.132}$$

and $|\varphi(v)| \le \|\varphi\|\,\|v\| = \|v\|$. Corollary B.42 shows this bound is saturated.


If $V$ is finite-dimensional, Proposition B.44 gives a natural isomorphism $V^{**} \cong V$, in contrast with the "unnatural" isomorphisms $V^* \cong V$ that require the choice of a basis (this terminology is made precise in category theory, see Appendix E).

In addition to their (metric) topology coming from the norm, both $V$ and $V^*$ naturally carry another topology (which will be of great importance in operator algebras and hence in quantum theory), defined in an almost identical way:


• The weak topology on $V$ is the weakest topology that makes all functions $\varphi : V \to \mathbb{C}$ continuous, $\varphi \in V^*$. Equivalently, one has convergence $v_n \to v$ (of sequences, or, more generally, of nets) iff $\varphi(v_n) \to \varphi(v)$ for each $\varphi \in V^*$.

• The weak* topology (or w*-topology) on $V^*$ is the weakest topology that makes all functions $\hat v : V^* \to \mathbb{C}$ continuous, $v \in V$. Equivalently, it is the topology of pointwise convergence, in that $\varphi_n \to \varphi$ iff $\varphi_n(v) \to \varphi(v)$ for each $v \in V$ (etc.).


As their names suggest, these topologies are weaker than the norm topologies (except when $V$ is finite-dimensional): indeed, if $\|v_n - v\| \to 0$ and $\varphi \in V^*$, then certainly $|\varphi(v_n) - \varphi(v)| \le \|\varphi\|\,\|v_n - v\| \to 0$, and similarly for $V^*$. Consequently, a functional $\varphi : V \to \mathbb{C}$ is norm-continuous if it is weakly continuous, but the converse may be false. Nonetheless, the weak dual of $V$ coincides with its norm dual, and we combine this with a contrasting result for the weak* continuous functionals on $V^*$, which en passant locates the image $\hat V$ of $V$ in $V^{**}$ under (B.129):


Proposition B.46. • Any functional $\varphi \in V^*$ is weakly continuous.
• A functional $\theta \in V^{**}$ is weak* continuous iff $\theta \in \hat V$.


We just mention that, because of Corollary B.45.2, this proposition is a special case of a very general result on topological vector spaces. Namely, let $V$ and $W$ be two vector spaces in separating duality, that is, there is a bilinear form

$$\langle\cdot,\cdot\rangle : V \times W \to \mathbb{C}$$

such that for each pair of distinct $v, v' \in V$ there is $w \in W$ with $\langle v,w\rangle \neq \langle v',w\rangle$, and for each pair of distinct $w, w' \in W$ there is $v \in V$ with $\langle v,w\rangle \neq \langle v,w'\rangle$. Then $V$ can be given the so-called $\sigma(V,W)$-topology, which is the weakest topology making each map $v \mapsto \langle v,w\rangle$ continuous ($w \in W$), and $W$ likewise carries the $\sigma(W,V)$-topology (sometimes also called the $\sigma(V,W)$-topology). In particular, the weak topology on $V$ is just the $\sigma(V,V^*)$-topology, whereas the weak* topology on $V^*$ is the $\sigma(V^*,V)$-topology.

Theorem B.47. Let $V$ and $W$ be vector spaces in separating duality. The space of $\sigma(V,W)$-continuous linear functionals on $V$ coincides with $W$, and likewise, the space of $\sigma(W,V)$-continuous linear functionals on $W$ coincides with $V$.


This follows from elementary topology, and hence we omit the proof. From this point of view, the apparent difference between the two parts of Proposition B.46 originates in the fact that the weak* topology on $V^*$ is defined by its separating duality with $V$ (or, equivalently, with $\hat V$), rather than its separating duality with $V^{**}$.
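To make the gap between norm and weak convergence concrete, here is a small numerical sketch. It assumes $V = \ell^2(\mathbb{N})$, truncated to the first `N` coordinates for the computation, and pairs the standard basis vectors $e_n$ with the single functional induced by $y = (1, 1/2, 1/3, \ldots) \in \ell^2$. The names `basis_vector`, `phi` and `N` are illustrative and not part of the text.

```python
import math

N = 1000  # truncation length (assumption: only the first N coordinates are kept)

def basis_vector(n):
    """n-th standard basis vector e_n (1-based), truncated to N coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, N + 1)]

def norm2(v):
    """l^2 norm."""
    return math.sqrt(sum(x * x for x in v))

def phi(v):
    """Bounded functional on l^2 induced by y_k = 1/k, i.e. phi(v) = sum_k v_k / k."""
    return sum(x / k for k, x in enumerate(v, start=1))

indices = (1, 10, 100, 1000)
values = [phi(basis_vector(n)) for n in indices]   # phi(e_n) = 1/n, tends to 0
norms = [norm2(basis_vector(n)) for n in indices]  # ||e_n - 0|| = 1 for every n
print(values, norms)
```

A single functional proves nothing by itself, of course. The same decay $\varphi(e_n) \to 0$ holds for every $\varphi \in (\ell^2)^*$ because the representing vector is square-summable, so $e_n \to 0$ weakly although $\|e_n\| = 1$ for all $n$.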

Next, the Banach–Alaoglu Theorem shows an unexpected but important property of the weak* topology (at least when $V$ is infinite-dimensional). For example, in quantum theory this theorem implies w*-compactness of the state space, and this, in turn (through the Krein–Milman Theorem), leads to an abundance of pure states.


Theorem B.48. If $V$ is a normed vector space, any $d$-ball

$$V^*_d = \{\varphi \in V^*,\ \|\varphi\| \le d\} \tag{B.133}$$

is compact in the weak* topology. More generally, if $U$ is any neighborhood of 0 in $V$, the set $V^*_U = \{\varphi \in V^*,\ |\varphi(v)| \le d\ \ \forall v \in U\}$ is w*-compact.


Clearly, $U = V_1$ (the unit ball of $V$) yields (B.133). Omitting the proof, we just note that the first claim is based on the fact that $V^*_1$ is a closed subset of the space

$$\prod_{v \in V} \{z \in \mathbb{C} \mid |z| \le \|v\|\},$$


which is compact by Tychonoff’s Theorem in topology (such reliance on awful non- 
constructive results is unfortunately typical of traditional functional analysis). 
After this abstract theory, it is high time to turn to some examples; see Table B.1. 


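No. | V                 | V*                                  | V*-V pairing                                                   | comment
1.  | $C_0(X)$          | $M(X)$                              | $\langle\mu,f\rangle = \int_X d\mu\, f$                        | $X$ locally compact Hausdorff space
2.  | $C_b(X)$          | $M(\beta X)$                        | $\langle\mu,f\rangle = \int_{\beta X} d\mu\, f$                | $\beta X$ Čech–Stone compactification of $X$
3.  | $\ell_0(X)$       | $\ell^1(X)$                         | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $X$ countable set, $\ell_0(\mathbb{N})$ often called $c_0$
4.  | $\ell^1(X)$       | $\ell^\infty(X)$                    | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $X$ countable set
5.  | $\ell^\infty(X)$  | $ba(X,\mathcal{P}(X))_{\mathbb{C}}$ | $\langle\mu,g\rangle = \int_X d\mu\, g$                        | bounded finitely additive signed measures on $X$
6.  | $\ell^p(X)$       | $\ell^q(X)$                         | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $\frac1p + \frac1q = 1$, $p,q \neq 1,\infty$, $X$ countable
7.  | $\ell^2(X)$       | $\ell^2(X)$                         | $\langle f,g\rangle = \sum_{x\in X} \overline{f(x)}\,g(x)$     | $\ell^2$ treated as a Hilbert space
8.  | $H$               | $H$                                 | $\langle f,g\rangle = \langle f,g\rangle_H$                    | $H$ general Hilbert space
9.  | $L^2(X)$          | $L^2(X)$                            | $\langle f,g\rangle = \int_X d\mu\, \overline{f}g$             | $L^2$ treated as a Hilbert space
10. | $L^1(X)$          | $L^\infty(X)$                       | $\langle f,g\rangle = \int_X d\mu\, fg$                        | $(X,\Sigma,\mu)$ $\sigma$-finite measure space
11. | $L^p(X)$          | $L^q(X)$                            | $\langle f,g\rangle = \int_X d\mu\, fg$                        | $\frac1p + \frac1q = 1$, $p,q \neq 1,\infty$
12. | $B_0(H)$          | $B_1(H)$                            | $\langle\rho,a\rangle = \mathrm{Tr}(\rho a)$                   | $B_0(H)$ compact operators, $B_1(H)$ trace class
13. | $B_1(H)$          | $B(H)$                              | $\langle a,\rho\rangle = \mathrm{Tr}(\rho a)$                  | $B(H)$ bounded operators on $H$
14. | $M_*$             | $M$                                 | $\langle a,\varphi\rangle = \varphi(a)$                        | $M_*$ predual of von Neumann algebra $M$


Table B.1 Some Banach spaces and their duals, up to isometric isomorphism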




1. The first entry is Theorem B.24. 


2. This one is true by definition if we define the Čech–Stone compactification $\beta X$ of a locally compact (Hausdorff) space as the Gelfand spectrum of $C_b(X)$ as a commutative C*-algebra, or, equivalently, by

$$C_b(X) \cong C(\beta X). \tag{B.134}$$

The compact Hausdorff space $\beta X$ then has the feature that each $f \in C_b(X)$ has a unique continuous extension to $\beta X$. More generally, let $X$ be a topological space. Provided it exists, "the" Čech–Stone compactification of $X$, denoted by $\beta X$, is a compact Hausdorff space together with a continuous map $\beta_X : X \to \beta X$ such that for each compact Hausdorff space $K$ and each continuous function $f : X \to K$ there is a unique continuous function $\beta f : \beta X \to K$ such that the following diagram commutes:

$$\begin{array}{ccc} X & \xrightarrow{\ \beta_X\ } & \beta X \\ & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \beta f} \\ & & K \end{array} \tag{B.135}$$

This universal property makes $\beta X$ unique up to homeomorphism (if it exists). If $X$ is locally compact Hausdorff, then $\beta X$ exists and $\beta_X$ is injective, making $\beta_X(X) \cong X$ a dense subspace of $\beta X$. The above diagram then implies (B.134) through $f \mapsto \beta f$; just take $K = \mathrm{Ran}(f)^-$, which is compact since $f$ is bounded.

Specializing this case to arbitrary sets $X$ seen as discrete topological spaces, we can give an explicit description of $\beta X$ as the set of all ultrafilters on $X$.


Definition B.49. Let $X$ be any set (seen as a discrete topological space).


• A filter on $X$ is a non-empty collection $F$ of subsets of $X$ such that $A \in F$ and $B \in F$ implies $A \cap B \in F$, $A \in F$ and $A \subseteq B$ implies $B \in F$, and finally $\emptyset \notin F$.

• An ultrafilter is a filter that is maximal in the set of all proper filters $F$ (i.e. $F \neq \mathcal{P}(X)\backslash\emptyset$), ordered by inclusion. It is straightforward to show that a filter $F$ is maximal iff one and hence all of the following equivalent conditions hold:
a. for any $A \subseteq X$ we have either $A \in F$ or $A^c \in F$;
b. if $A \cup B \in F$, then $A \in F$ or $B \in F$ (i.e., $F$ is prime);
c. if $A \cap B \neq \emptyset$ for all $B \in F$, then $A \in F$.

• For any $x \in X$, the set $U_x$ of all subsets of $X$ that contain $x$ forms an ultrafilter, called principal; any ultrafilter not of this kind is called free (if $|X| = \infty$, the existence of free ultrafilters on $X$ follows from Zorn's Lemma).


For discrete $X$, the set of all ultrafilters on $X$, endowed with the topology generated by all sets of the form $U_A = \{U \in \beta X \mid A \in U\}$, where $A \subseteq X$, is a realization of the Čech–Stone compactification of $X$, and may therefore be denoted by $\beta X$. Note that each $U_A$ is clopen in $\beta X$. The embedding $\beta_X$ maps $x \in X$ to the principal ultrafilter $U_x$, and the continuous extension $\beta f$ of $f : X \to K$ is given by

$$\beta f(U) = \lim_U f = \bigcap_{A \in U} f(A)^-. \tag{B.136}$$

Theorem 4.24 then explains the pairing in no. 2 of Table B.1 (see also no. 5).
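For a finite set every ultrafilter is principal, which can be checked by brute force. The sketch below assumes $X = \{0,1,2\}$, enumerates all collections of subsets, keeps those satisfying the filter axioms of Definition B.49, and selects the maximal ones. The function names are illustrative only.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(F, X):
    """Filter axioms: non-empty, no empty set, closed under intersections and supersets."""
    if not F or frozenset() in F:
        return False
    for A in F:
        if any((A & B) not in F for B in F):
            return False
        if any(A <= B and B not in F for B in powerset(X)):
            return False
    return True

def ultrafilters(X):
    """Filters on X that are maximal with respect to inclusion."""
    filters = [F for F in powerset(powerset(X)) if is_filter(F, X)]
    return [F for F in filters if not any(F < G for G in filters)]

def principal(x, X):
    """Principal ultrafilter U_x: all subsets of X containing x."""
    return frozenset(A for A in powerset(X) if x in A)

X = frozenset({0, 1, 2})
found = ultrafilters(X)
print(len(found), set(found) == {principal(x, X) for x in X})
```

Condition (a) of the definition can be read off from the result: for each ultrafilter found and each $A \subseteq X$, exactly one of $A$ and $A^c$ belongs to it. Free ultrafilters only appear for infinite $X$ and cannot be produced by such an enumeration.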

3. • This is a special case of no. 1, since $\ell_0(X) = C_0(X)$, given that $X$ is discrete (as a topological space). We then use the (Lebesgue–)Radon–Nikodym Theorem of measure theory: if $(X,\Sigma,\mu)$ is a $\sigma$-finite measure space and $\nu$ is a complex measure on $\Sigma$ that is absolutely continuous with respect to $\mu$ (i.e., $\mu(A) = 0$ implies $\nu(A) = 0$, $A \in \Sigma$), then there is a function $d\nu/d\mu \in L^1(X)$ such that

$$\int_X d\nu\, f = \int_X d\mu\, \frac{d\nu}{d\mu}\, f, \quad f \in L^\infty(X). \tag{B.137}$$

In the case at hand, $X$ is countable and $\mu$ is the counting measure, with respect to which any measure is absolutely continuous. This yields $M(X) = \ell^1(X)$.

• Secondly, this duality is also a special case of Theorem B.27: as in (B.78),

$$\ell_0(X) = \ell^\infty(X, \mathcal{P}_f(X)), \tag{B.138}$$

so that bounded hermitian functionals $\varphi : \ell_0(X) \to \mathbb{C}$ (which in this case correspond to bounded real-linear functionals $\ell_0(X,\mathbb{R}) \to \mathbb{R}$) are given by

$$\varphi(g) = \int_X d\mu\, g = \lim_{n\to\infty} \int_X d\mu\, s_n,$$

where $g \in \ell_0(X)$, $(s_n)$ is a sequence in $\mathrm{Step}(X, \mathcal{P}_f(X))$, which simply consists of functions on $X$ with finite support, and $\mu$ is a finitely additive bounded signed measure on $\mathcal{P}_f(X)$, which is given by its values on any singleton $x \in X$ and hence is just a real-valued function

$$f(x) = \mu(\{x\}); \tag{B.139}$$

boundedness of $\mu$ gives $f \in \ell^1(X)$. Writing $X = \bigcup_n X_n$, where the $X_n$ are finite and $X_n \subset X_{n+1}$ (e.g., for $X = \mathbb{N}$ one may take $X_n = \{1,\ldots,n\}$, so that $\sum_{x\in X_n} = \sum_{x=1}^n$), we may use $s_n = f_{|X_n}$ on $X_n$ and $s_n(x) = 0$ outside $X_n$, which gives

$$\varphi(g) = \lim_{n\to\infty} \sum_{x\in X_n} f(x)g(x) = \sum_{x\in X} f(x)g(x). \tag{B.140}$$

One easily verifies that indeed $\|f\|_1 = \|\mu\|$, since (B.56) - (B.57) yield

$$\|\mu\| = \sup\{\mu_+(A) + \mu_-(A) \mid A \in \mathcal{P}_f(X)\} = \sup\Big\{\sum_{x\in A} |f(x)| \ \Big|\ A \in \mathcal{P}_f(X)\Big\},$$

whose right-hand side in turn is equal to $\|f\|_1$.
• As a third approach, we give a direct proof of the desired duality

$$\ell_0(X)^* \cong \ell^1(X). \tag{B.141}$$

To start, for $f \in \ell^1(X)$ and $g \in \ell_0(X)$, we define an expression $\varphi_f(g)$ by

$$\varphi_f(g) = \langle f,g\rangle = \sum_{x\in X} f(x)g(x). \tag{B.142}$$

By the obvious estimate

$$|\varphi_f(g)| \le \|f\|_1\,\|g\|_\infty, \tag{B.143}$$

which is Hölder's inequality for $p = 1$ and $q = \infty$, the sum (B.142) is absolutely convergent, and hence defines a linear map $\varphi_f : \ell_0(X) \to \mathbb{C}$, which satisfies $\|\varphi_f\| \le \|f\|_1$. Thus the map $f \mapsto \varphi_f$ is well defined from $\ell^1(X)$ to $\ell_0(X)^*$. To prove surjectivity of this map, for given $\varphi \in \ell_0(X)^*$, define $f : X \to \mathbb{C}$ by

$$f(x) = \varphi(\delta_x). \tag{B.144}$$

It follows from continuity of $\varphi$ that $\varphi = \varphi_f$, cf. (B.140), but it remains to be shown that $f \in \ell^1(X)$. To do so, for each $n \in \mathbb{N}$ we define $\varphi_n : \ell_0(X) \to \mathbb{C}$ by

$$\varphi_n(g) = \sum_{x\in X_n} f(x)g(x). \tag{B.145}$$

This operator is bounded, with

$$\|\varphi_n\| = \|s_n\|_1, \tag{B.146}$$

where $s_n$ was defined prior to (B.140). To see this, we have

$$\|\varphi_n\| \le \|s_n\|_1, \tag{B.147}$$

from (B.143), whereas the opposite inequality follows from a trick: define

$$g_n(x) = \overline{f(x)}/|f(x)| \quad (x \in X_n,\ f(x) \neq 0); \tag{B.148}$$
$$g_n(x) = 0 \quad \text{(otherwise)}, \tag{B.149}$$

so that, assuming $\varphi \neq 0$, we have $\|g_n\|_\infty = 1$ and $\varphi_n(g_n) = \|s_n\|_1$, and hence

$$\|\varphi_n\| = \|s_n\|_1. \tag{B.150}$$

Since $\varphi(g) = \varphi_f(g)$ is finite by assumption, as in (B.140) $\lim_{n\to\infty} \varphi_n(g)$ exists for each $g \in \ell_0(X)$. Hence $\lim_{n\to\infty} |\varphi_n(g)|$ exists, so $\sup_n\{|\varphi_n(g)|\} < \infty$. The Principle of Uniform Boundedness (cf. Theorem B.78 below) then gives $\sup_n\{\|\varphi_n\|\} < \infty$, and this supremum equals $\sup_n\{\|s_n\|_1\} = \|f\|_1$.


Comparing the first two approaches, we see that bounded finitely additive measures on $\mathcal{P}_f(X)$ bijectively correspond to bounded $\sigma$-additive measures on $\mathcal{P}(X)$, both of which in turn are given by positive functions $f \in \ell^1(X)$.
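The "trick" behind (B.148) - (B.150) is easy to test numerically. The sketch below assumes a finitely supported complex sequence `f` on $X = \{0,\ldots,19\}$ and $X_n = \{0,\ldots,n-1\}$, builds the phase function $g_n$, and checks that it has sup-norm 1 and that $\varphi_n(g_n) = \sum_{x\in X_n} |f(x)|$. The data and function names are illustrative only.

```python
def phi_n(f, g, n):
    """Truncated pairing (B.145): sum of f(x) g(x) over X_n = {0, ..., n-1}."""
    return sum(f[x] * g[x] for x in range(n))

def phase_function(f, n):
    """g_n from (B.148)-(B.149): conj(f(x))/|f(x)| on X_n where f(x) != 0, else 0."""
    return [f[x].conjugate() / abs(f[x]) if x < n and f[x] != 0 else 0j
            for x in range(len(f))]

# Illustrative sequence with decaying modulus, rotating phase and one zero entry.
f = [(1j) ** x / 2 ** x for x in range(20)]
f[3] = 0j

n = 10
g = phase_function(f, n)
l1_norm_on_Xn = sum(abs(f[x]) for x in range(n))   # the 1-norm of f restricted to X_n
sup_norm_g = max(abs(z) for z in g)
value = phi_n(f, g, n)
print(value, l1_norm_on_Xn, sup_norm_g)
```

Since $f(x)\,\overline{f(x)}/|f(x)| = |f(x)|$, the pairing reproduces the truncated 1-norm, which is exactly why the bound (B.147) is attained.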
4. This is similar to the third proof of the previous case. For $f \in \ell^\infty(X)$ and $g \in \ell^1(X)$, we define $\varphi_f(g)$ by (B.142), and instead of (B.143) we now obtain

$$|\varphi_f(g)| \le \|f\|_\infty\,\|g\|_1. \tag{B.151}$$

Thus we have a map $f \mapsto \varphi_f$ from $\ell^\infty(X)$ to $\ell^1(X)^*$, satisfying $\|\varphi_f\| \le \|f\|_\infty$. To prove surjectivity, for some $\varphi \in \ell^1(X)^*$ we once again define $f : X \to \mathbb{C}$ by (B.144), so that $\varphi = \varphi_f$ by continuity. Then for any $x \in X$, we obtain $|f(x)| \le \|\varphi\|\,\|\delta_x\|_1 = \|\varphi\|$, so $\|f\|_\infty \le \|\varphi\|$ and hence $\|\varphi\| = \|f\|_\infty$. In particular, $f \in \ell^\infty(X)$ and the bijection $\varphi_f \leftrightarrow f$ gives an isometric isomorphism à la (B.141):

$$\ell^1(X)^* \cong \ell^\infty(X). \tag{B.152}$$


5. Here one has

$$\ell^\infty(X)^* \cong M(\beta X); \tag{B.153}$$
$$\ell^\infty(X,\mathbb{R})^* \cong ba(X,\mathcal{P}(X)), \tag{B.154}$$

cf. no. 2, and Theorem B.27, respectively. Thus bounded finitely additive measures $\mu$ on $X$ (with underlying semiring $\mathcal{A} = \mathcal{P}(X)$) bijectively correspond to bounded $\sigma$-additive measures $\mu_\beta$ on $\beta X$ (equipped with the Borel $\sigma$-algebra) by

$$\int_X d\mu\, f = \int_{\beta X} d\mu_\beta\, \beta f \tag{B.155}$$

for any $f \in \ell^\infty(X)$. This is not as surprising as it seems, because there is a bijective correspondence between ultrafilters $U$ on $X$ and finitely additive probability measures $\mu_U$ on $X$ that take values in $\{0,1\}$. This correspondence is given by:

$$U = \{A \subseteq X \mid \mu_U(A) = 1\}; \tag{B.156}$$
$$\mu_U(A) = 1 \ \text{ iff } \ A \in U. \tag{B.157}$$

Principal ultrafilters $U_x$ thereby correspond to Dirac measures $\delta_x$ on $X$, whereas free ultrafilters $U$ correspond to (finitely additive) measures $\mu_U$ on $X$ that vanish on any finite subset of $X$. For general ultrafilters $U \in \beta X$ we have, for $f \in \ell^\infty(X)$,

$$\int_X d\mu_U\, f = \bigcap_{A \in U} f(A)^-, \tag{B.158}$$

where $f(A) = \{f(x) \mid x \in A\}$ as usual, and $f(A)^-$ is the closure of this set in $\mathbb{C}$. Thus (B.158) is equal to the unique $z \in \mathbb{C}$ with the property that for each $\varepsilon > 0$ the set $\{x \in \mathbb{N} : |f(x) - z| < \varepsilon\}$ lies in $U$; for $U = U_x$, this recovers $z = f(x)$.

6. This is similar to nos. 3 and 4, but is slightly more involved. For $f \in \ell^q(X)$ and $g \in \ell^p(X)$, with (B.16) and $p,q \neq 1,\infty$, we again define $\varphi_f(g)$ by (B.142), upon which Hölder's inequality yields $\|\varphi_f\| \le \|f\|_q$. Conversely, for $\varphi \in \ell^p(X)^*$, once again define $f$ by (B.144), so that $\varphi = \varphi_f$. We now show that $\|f\|_q \le \|\varphi\|$. Pick $X_n \subset X$ as defined below (B.139), and define $f_n : X \to \mathbb{C}$ by $f_n(x) = f(x)$ if $x \in X_n$ and $f_n(x) = 0$ if $x \notin X_n$. If $\|f\|_q < \infty$, then $\|f\|_q = \sup_n \|f_n\|_q$. Now define

$$g_n(x) = |f_n(x)|^q / f_n(x) \quad (f_n(x) \neq 0); \tag{B.159}$$
$$g_n(x) = 0 \quad (f_n(x) = 0). \tag{B.160}$$

Using (B.142), we obtain

$$\|f_n\|_q\,\|f_n\|_q^{q-1} = \|f_n\|_q^q = \langle f_n, g_n\rangle = \varphi(g_n) \le \|\varphi\|\,\|g_n\|_p = \|\varphi\|\,\|f_n\|_q^{q-1}, \tag{B.161}$$

whence $\|f_n\|_q \le \|\varphi\|$. Taking $\sup_n$ gives $\|f\|_q \le \|\varphi\|$, and hence

$$\ell^p(X)^* \cong \ell^q(X). \tag{B.162}$$
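The two identities used in (B.161), namely $\langle f_n, g_n\rangle = \|f_n\|_q^q$ and $\|g_n\|_p = \|f_n\|_q^{q-1}$, can be checked on a small example. The sketch below assumes a real finitely supported `f` and the conjugate pair $p = 3$, $q = 3/2$; all names and numbers are illustrative.

```python
import math

def lp_norm(v, p):
    """l^p norm of a finite sequence."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def saturating_g(f, q):
    """g from (B.159)-(B.160): |f(x)|^q / f(x) where f(x) != 0, else 0."""
    return [abs(x) ** q / x if x != 0 else 0.0 for x in f]

def pairing(f, g):
    """Bilinear pairing (B.142) on finitely supported sequences."""
    return sum(a * b for a, b in zip(f, g))

p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
f = [1.0, -2.0, 0.0, 0.5]
g = saturating_g(f, q)

lhs = pairing(f, g)                  # should equal ||f||_q ** q
norm_g = lp_norm(g, p)               # should equal ||f||_q ** (q - 1)
norm_f = lp_norm(f, q)
print(lhs, norm_f ** q, norm_g, norm_f ** (q - 1))
```

The second identity rests on $(q-1)p = q$, which is just a rewriting of $1/p + 1/q = 1$. Together they show that Hölder's inequality is saturated by this choice of $g$.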


7. $p = q = 2$ stands out as a special, self-dual case. As the next item explains, this is because $\ell^2(X)$ is a Hilbert space with inner product (B.10). This differs from the pairing (B.142) by the complex conjugation of the first term, making it appropriate to redefine the pairing between $\ell^2(X)^*$ and $\ell^2(X)$ in terms of the inner product. This leads to an antilinear isometric isomorphism $\ell^2(X)^* \cong \ell^2(X)$, as opposed to the linear isometric isomorphisms for all other values of $p,q$.


8. Proposition A.5 generalizes to infinite-dimensional Hilbert spaces (in which case it is often named after Riesz and Fréchet), with the following additions to the proof. First, boundedness of $f$ guarantees that $\ker(f)$ is a closed subspace of $H$, so that (if $f \neq 0$) the orthogonal complement $\ker(f)^\perp$ is not empty by Proposition B.57 below. Second, uniqueness of the representing vector $\psi$ in (A.13) now needs to be shown. This is easy: if $\langle\psi,\varphi\rangle = \langle\psi',\varphi\rangle$ for all $\varphi \in H$, then, taking $\varphi = \psi - \psi'$, it follows that $\langle\psi - \psi', \psi - \psi'\rangle = \|\psi - \psi'\|^2 = 0$, hence $\psi' = \psi$.


No. 9 follows from no. 8, whilst 10 and 11 are similar to 4 and 6, except for some tricky measure-theoretic details. We only sketch the main idea (where for simplicity we assume $\mu$ is finite; using an approximation procedure the result is valid also for the $\sigma$-finite case, but not beyond!). Namely, the function $f$ representing the functional $\varphi \in L^p(X)^*$ is constructed by first defining a complex measure $\nu$ on $\Sigma$ by $\nu(A) = \varphi(1_A)$, $A \in \Sigma$. Using (B.85), we see that $\nu$ is absolutely continuous with respect to $\mu$, and we put

$$f = d\nu/d\mu. \tag{B.163}$$

Using definition (B.29) of integration, this yields

$$\varphi(g) = \langle f,g\rangle = \int_X d\mu\, fg, \tag{B.164}$$

and similar arguments as in the discrete case show that $f \in L^q(X)$.

Nos. 12-13 follow from Theorem B.146 below, and no. 14, which is forward-looking, too, is true by definition of the predual of a von Neumann algebra (whose existence is highly nontrivial); see Theorem C.132 in Appendix §C.
B.10 The Krein–Milman Theorem


Returning to the abstract theory, we now apply the Hahn–Banach Theorem and duality theory to prove one of the most beautiful results in functional analysis.
The boundary $\partial_e K$ of a convex set $K$ consists of all $v \in K$ satisfying:


if $v = tw + (1-t)x$ for certain $w,x \in K$ and $t \in (0,1)$, then $v = w = x$.


Hence Carathéodory's Theorem 1.12, which, we recall, states that if $K$ is a nonempty compact convex subset of $\mathbb{R}^n$, then $\partial_e K \neq \emptyset$, and each point of $K$ is a convex sum of at most $n+1$ points in $\partial_e K$, implies, in particular, that $\partial_e K$ is not empty. This is readily visualized: the simplest example is $K = [0,1]$, where $\partial_e K = \{0,1\}$. One also has triangles in the plane, whose boundaries consist of their vertices (rather than their sides, which are among their faces, see below). Furthermore, the closed (unit) three-ball $B^3$ in $\mathbb{R}^3$ is convex, with boundary $\partial_e B^3 = S^2$, cf. Proposition 2.9. In these examples the interior of $K$, which is still convex, would have an empty boundary, so that the assumption of compactness in Theorem 1.12 is absolutely essential.
Carathéodory's Theorem follows from a straightforward induction argument in the dimension of $K$, and the following Krein–Milman Theorem. The convex hull $\mathrm{co}(X)$ of a subset $X$ of a vector space is defined as the set of all finite convex sums $\sum_i t_i x_i$, where $t_i \ge 0$, $\sum_i t_i = 1$, and $x_i \in X$; this is the smallest convex set containing $X$.
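Carathéodory's statement can be made tangible in the plane, where $n + 1 = 3$. The sketch below assumes $K$ is a triangle in $\mathbb{R}^2$ with vertices `a`, `b`, `c` (its extreme points) and computes the barycentric weights of a point `p` by Cramer's rule, in exact rational arithmetic. The function name and the sample data are illustrative only.

```python
from fractions import Fraction as Fr

def barycentric(p, a, b, c):
    """Weights (t_a, t_b, t_c) with p = t_a*a + t_b*b + t_c*c and t_a + t_b + t_c = 1.

    Solves p - a = t_b*(b - a) + t_c*(c - a) by Cramer's rule; the triangle
    must be non-degenerate (det != 0).
    """
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    t_b = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    t_c = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return (1 - t_b - t_c, t_b, t_c)

a, b, c = (0, 0), (1, 0), (0, 1)
p = (Fr(1, 4), Fr(1, 2))
weights = barycentric(p, a, b, c)
inside = all(t >= 0 for t in weights)   # p lies in K iff all weights are non-negative
print(weights, inside)
```

A point lies in the triangle precisely when all three weights are non-negative, in which case it is a convex sum of at most three extreme points, as the theorem predicts for $n = 2$.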


Theorem B.50. Let $V$ be a real normed vector space with dual $V^*$, and let $K$ be a convex subset of $V^*$ that is compact in the w*-topology. Then $\partial_e K \neq \emptyset$, and each point of $K$ lies in the w*-closure of the convex hull of $\partial_e K$. In other words,

$$K = (\mathrm{co}(\partial_e K))^-. \tag{B.165}$$

Zorn's Lemma will be used twice in the proof: both directly and through Theorem B.43, which relies on the Hahn–Banach Theorem B.40, whose proof uses Zorn. Furthermore, a face of a convex set $K$ is a nonempty convex subset $F \subseteq K$ such that:


If $z = tx + (1-t)y$ for $z \in F$ with $t \in (0,1)$ and $x,y \in K$, then $x,y \in F$.


In particular, each extreme point $x \in \partial_e K$ is a face in its own right; conversely, a face consisting of a single point lies in $\partial_e K$ (as should be clear from the definitions).


Proof. 1. Let $\mathcal{F}(K)$ be the set of all closed faces in $K$, partially ordered by inverse inclusion, i.e., $F_1 \le F_2$ iff $F_2 \subseteq F_1$. The intersection of any finite subset of a totally ordered subset $\{F_\lambda\}$ of $\mathcal{F}(K)$ is obviously nonempty, so that, by compactness of $K$, we also have $\bigcap_\lambda F_\lambda \neq \emptyset$. (Proof by contradiction: if $\bigcap_\lambda F_\lambda = \emptyset$, then $\bigcup_\lambda F_\lambda^c = (\bigcap_\lambda F_\lambda)^c = \emptyset^c = K$, so that $\{F_\lambda^c\}$ is an open cover of $K$, which by definition of compactness has a finite subcover $\{F_{\lambda'}^c\}$. By the same argument, $\bigcap_{\lambda'} F_{\lambda'} = \emptyset$.) Hence $\bigcap_\lambda F_\lambda$ is an upper bound of $\{F_\lambda\}$, so that Zorn gives us a (not necessarily unique) maximal element $F_0$ in $\mathcal{F}(K)$ (which set-theoretically is minimal because of the reverse ordering, i.e., $F_0$ contains no strictly smaller closed face).
2. We now show that $F_0$ must be a singleton (and hence an extreme point of $K$). For any $v \in V$, the function $\hat v : V^* \to \mathbb{R}$ defined by $\hat v(\varphi) = \varphi(v)$ is w*-continuous, see Propositions B.44 and B.46. Since $F_0 \subseteq K$ is compact, $\hat v$ assumes a minimum on $F_0$, say $m$. The set

$$F_m = \{\varphi \in F_0 \mid \hat v(\varphi) = m\} \tag{B.166}$$

is not only closed (by continuity of $\hat v$), and hence compact (since $F_0$ is), but it is again a face in $K$: first, if $\varphi \in F_m$ takes the form

$$\varphi = t\varphi_1 + (1-t)\varphi_2, \tag{B.167}$$

with $\varphi_1, \varphi_2 \in F_0$, then

$$\hat v(\varphi) = m = t\hat v(\varphi_1) + (1-t)\hat v(\varphi_2), \tag{B.168}$$

which, given that $\hat v(\varphi_i) \ge m$, is only possible if $\hat v(\varphi_1) = \hat v(\varphi_2) = m$, so that $\varphi_i \in F_m$. Hence $F_m$ is a face in $F_0$, but this implies that it is equally well a face in $K$. Namely, if (B.167) holds for $\varphi \in F_m$ and $\varphi_i \in K$, then regarding $\varphi$ as an element of $F_0$ gives $\varphi_i \in F_0$, because $F_0$ is a face in $K$, upon which the previous step, where we regard $\varphi$ as an element of $F_m$, gives $\varphi_i \in F_m$.

Since $F_0$ is maximal, we must have $F_m = F_0$, so that each functional $\hat v$ is constant on $F_0$. Now we know (even without the Hahn–Banach Theorem) that the functionals $\hat v$ separate points in $V^*$, since the very statement that $\varphi_1 \neq \varphi_2$ means that there is some $v \in V$ such that $\varphi_1(v) \neq \varphi_2(v)$ and hence $\hat v(\varphi_1) \neq \hat v(\varphi_2)$. So if $F_0$ contains more than one point, there must be a functional $\hat v$ that is not constant on $F_0$. Hence $F_0$ is a singleton, and therefore an element of $\partial_e K$. That is, $\partial_e K \neq \emptyset$.


3. The same argument applies to any closed face $F$ in $K$, showing that each $F \in \mathcal{F}(K)$ contains at least one point in $\partial_e F$. But such a point is a face in $F$ and hence in $K$, and being a one-point face in $K$, it must lie in $\partial_e K$. So we may strengthen the previous point by concluding that $F \cap \partial_e K \neq \emptyset$ for any closed face $F \subseteq K$.


4. To prove (B.165) by reductio ad absurdum, define

$$B = (\mathrm{co}(\partial_e K))^-, \tag{B.169}$$

and assume $B \neq K$. First note that $\mathrm{co}(\partial_e K)$ is convex by construction, and that its closure $B$ remains convex (because the vector space operations, and a fortiori the convex sums, are continuous). Its complement in $V^*$ is open, and hence any point $\alpha \in K \backslash B$ has an open convex neighbourhood $A \subset V^* \backslash B$ (see below), which is therefore disjoint from $B$. Hence Theorem B.43 applies (with $V \rightsquigarrow V^*$ and $\varphi \rightsquigarrow \hat v$), giving us $v \in V$ and $t \in \mathbb{R}$ for which $\hat v(\alpha) < t \le \hat v(\beta)$ for any $\beta \in B$.

Now define $s = \min\{\hat v(\varphi) \mid \varphi \in K\}$, which exists since $K$ is w*-compact and $\hat v$ is w*-continuous. Since $\alpha \in K \backslash B \subseteq K$ and $\hat v(\alpha) < t$, we have $s < t$. Subsequently, define $F_s = \{\varphi \in K \mid \hat v(\varphi) = s\}$. As in step 2 above, it follows that $F_s$ is a closed face in $K$. According to step 3, there is a point $\omega \in F_s \cap \partial_e K$, so that $\hat v(\omega) = s$. This contradicts $s < t \le \hat v(\beta)$ for any $\beta \in B$, as $\omega \in \partial_e K \subseteq B$.
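The mechanism of steps 2 to 4, in which minimizing a linear functional over $K$ cuts out a face that must contain an extreme point, is visible already for a polygon. The sketch below assumes $K$ is the unit square in $\mathbb{R}^2$, described by its four vertices, and uses the standard fact that a linear functional on a polytope attains its minimum at a vertex. Names and data are illustrative only.

```python
def min_face_vertices(vertices, v):
    """Return (s, face): the minimum s of the functional x -> <v, x> over the
    given vertices, and the vertices at which it is attained (these span F_s)."""
    values = {x: v[0] * x[0] + v[1] * x[1] for x in vertices}
    s = min(values.values())
    face = sorted(x for x, val in values.items() if val == s)
    return s, face

square = [(0, 0), (1, 0), (0, 1), (1, 1)]   # extreme points of K = [0,1]^2

s_edge, edge = min_face_vertices(square, (1, 0))      # face is a whole edge
s_point, point = min_face_vertices(square, (1, 1))    # face is a single vertex
print(s_edge, edge, s_point, point)
```

For the functional $(1,0)$ the face $F_s$ is the left edge, which is not a singleton but still contains extreme points of $K$. For $(1,1)$ it shrinks to the single extreme point $(0,0)$, the situation forced by maximality in step 2.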
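The existence of $A$ in step 4 above arises from the fact that open sets of the form

$$\mathcal{O}^{\varepsilon}_{v_1,\ldots,v_n} = \{\varphi \in V^*,\ |\varphi(v_i)| < \varepsilon\ (i = 1,\ldots,n)\}, \tag{B.170}$$

where $\varepsilon > 0$ and all $v_i \in V$, form a basis of w*-neighbourhoods of $0 \in V^*$, and hence its translates $\varphi + \mathcal{O}^{\varepsilon}_{v_1,\ldots,v_n}$ form such a basis for any $\varphi \in V^*$; the point is that such sets are convex, because if $|\varphi_i(v)| < \varepsilon$ for $i = 1,2$ and $t \in (0,1)$, then

$$|t\varphi_1(v) + (1-t)\varphi_2(v)| \le t|\varphi_1(v)| + (1-t)|\varphi_2(v)| < t\varepsilon + (1-t)\varepsilon = \varepsilon. \tag{B.171}$$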


Although the Krein—Milman Theorem is of considerable interest and beauty in 
itself, our main use of it lies in a few corollaries. Among those is Choquet’s Theorem 
in the next section, but we first turn to the Stone—Weierstrass Theorem: 


Theorem B.51. Let $X$ be a compact Hausdorff space. Let $B$ be an involutive subalgebra of $C(X)$ (regarded as a commutative C*-algebra) that separates points on $X$ (i.e., if $x \neq y$ there is $f \in B$ such that $f(x) \neq f(y)$) and contains the unit function $1_X$. Then $B$ is dense in $C(X)$ in the sup-norm. In particular, if $B$ is closed, then $B = C(X)$.


In other words, $B$ is a linear subspace of $C(X)$ such that if $f,g \in B$, then $fg \in B$, and if $f \in B$, then $f^* \in B$, where $f^*(x) = \overline{f(x)}$. Furthermore, $C(X)$ and hence $B$ are equipped with the sup-norm. The assumptions could even be weakened: instead of asking that $1_X \in B$ and that $B$ separate points, for the proof we just need that for each $x \neq y$ in $X$ and $s,t \in \mathbb{R}$ there is $f \in B$ such that $f(x) = s$ and $f(y) = t$.
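A classical instance of this theorem is $X = [0,1]$ with $B$ the polynomials, which form a subalgebra that separates points and contains $1_X$. The sketch below does not follow the proof given in the text; as an illustration only, it uses Bernstein polynomials, a standard explicit approximation scheme, to show the sup-norm error shrinking for the non-smooth function $f(x) = |x - 1/2|$. The error is measured on a finite grid, and all names are illustrative.

```python
from math import comb

def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=101):
    """Maximum of |f - B_n f| over an equally spaced grid (a proxy for the sup-norm)."""
    pts = [i / (grid - 1) for i in range(grid)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in pts)

def f(x):
    return abs(x - 0.5)

err_10 = sup_error(f, 10)
err_200 = sup_error(f, 200)
print(err_10, err_200)
```

The error decreases as the degree grows, consistent with density of the polynomials in $C([0,1])$. The convergence is slow near the kink at $x = 1/2$, which is typical of Bernstein approximation.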


We are going to derive Theorem B.51 from Theorem B.50 and the following: 


Lemma B.52. Let $B$ be a linear subspace of some Banach space $V$. Then $B$ is dense in $V$ iff the only element $\varphi \in V^*$ that satisfies $\varphi(v) = 0$ for all $v \in B$ is $\varphi = 0$.


Proof. The "$\Rightarrow$" direction (which will not be needed) is immediate from the fact that $\varphi \in V^*$ is bounded and therefore, if $v = \lim v_\lambda$ for $(v_\lambda)$ in $B$, then $\varphi(v) = \lim \varphi(v_\lambda)$, so that $\varphi(v) = 0$ for all $v \in B$ implies $\varphi(v) = 0$ for all $v \in V$ and hence $\varphi = 0$.

Conversely, if $B^- \neq V$, we will exhibit some nonzero $\varphi \in V^*$ with $\varphi_{|B} = 0$. Take some $w \notin B^-$ and define $W \subset V$ by $W = \mathbb{C}\cdot w + B^-$, along with a map $\varphi_W : W \to \mathbb{C}$ given by $\varphi_W(\lambda w + v) = \lambda$ for any $\lambda \in \mathbb{C}$ and $v \in B^-$. This map is trivially linear, as well as bounded: since $w \notin B^-$ we have $\|w - v\| \ge d$ for some $d > 0$, for each $v \in B^-$; since then also $-v \in B^-$, we have $\|\lambda w + v\| \ge |\lambda| d$, and therefore

$$|\varphi_W(\lambda w + v)| = |\lambda| \le d^{-1}\|\lambda w + v\|.$$

By Corollary B.41, our $\varphi_W$ extends to some $\varphi \in V^*$, with $\varphi_{|B} = \varphi_{W|B} = 0$.


Proof. We now prove Theorem B.51. We define a subspace $B^\circ \subset M(X)$ by

$$B^\circ = \{\mu \in M(X) \mid \mu(f^*) = \overline{\mu(f)},\ \|\mu\| \le 1,\ \mu(f) = 0\ \ \forall f \in B\}, \tag{B.172}$$

where $f^*(x) = \overline{f(x)}$ as usual. Our aim is to show that

$$B^\circ = \{0\}. \tag{B.173}$$

Since any $\mu \in M(X)$ is a multiple of some $\mu'$ in the unit ball $\|\mu'\| \le 1$, eq. (B.173) gives the antecedent of the "$\Leftarrow$" part of Lemma B.52, which gives Theorem B.51. Noting that the w*-topology in $M(X)$ is just the topology in which $\mu_\lambda \to \mu$ iff $\mu_\lambda(f) \to \mu(f)$ for each $f \in C(X)$, we see that $B^\circ$ is closed in the unit ball of $M(X)$, so that it is w*-compact by the Banach–Alaoglu Theorem. Furthermore, $B^\circ$ is convex, so the Krein–Milman Theorem gives $\partial_e B^\circ \neq \emptyset$. Any $\mu \in \partial_e B^\circ$ has either $\|\mu\| = 0$, in which case (B.173) holds and we are ready, or, as we assume in what follows,

$$\|\mu\| = 1. \tag{B.174}$$

Indeed, if $0 < \|\mu\| < 1$, then

$$\mu = t\mu_1 + (1-t)\mu_2, \tag{B.175}$$

with $t = \|\mu\|$, $\mu_1 = \mu/\|\mu\|$, and $\mu_2 = 0$ would give a nontrivial decomposition of $\mu$.
For $g \in C(X)$, define

$$L_g : M(X) \to M(X); \tag{B.176}$$
$$L_g\mu(f) = \mu(gf), \tag{B.177}$$

or "$L_g d\mu = g \cdot d\mu$". It follows from the assumptions on $B$ in Theorem B.51 that if $0 \le g \le 1_X$ and $g \in B$ (as we will now assume), then $L_g$ maps $B^\circ$ into itself, and also $0 \le 1_X - g \le 1_X$. Hence $L_{1_X - g}$ maps $B^\circ$ into itself. Given (B.174), we then have

$$\|L_{1_X - g}\mu\| = 1 - \|L_g\mu\|. \tag{B.178}$$

This follows from (B.76): the Hahn–Jordan decomposition (B.55) of $\mu$ also gives $(L_g\mu)_\pm = L_g\mu_\pm$ and $(L_{1_X - g}\mu)_\pm = L_{1_X - g}\mu_\pm$ (since $g \ge 0$ and $1_X - g \ge 0$), so that

$$\|L_{1_X - g}\mu\| = L_{1_X - g}\mu_+(X) + L_{1_X - g}\mu_-(X) \tag{B.179}$$
$$= \mu_+(X) + \mu_-(X) - L_g\mu_+(X) - L_g\mu_-(X) = \|\mu\| - \|L_g\mu\|. \tag{B.180}$$

Because of (B.178), we obtain a convex decomposition (B.175) with $t = \|L_g\mu\|$, $\mu_1 = L_g\mu/\|L_g\mu\|$, and $\mu_2 = L_{1_X - g}\mu/\|L_{1_X - g}\mu\|$, which are well defined because of (B.174), which guarantees that the two denominators are nonzero. Since $\mu$ is extreme by assumption (i.e., it lies in $\partial_e B^\circ$), it must be that

$$\frac{L_g\mu}{\|L_g\mu\|} = \frac{L_{1_X - g}\mu}{\|L_{1_X - g}\mu\|} = \mu. \tag{B.181}$$

Hence $g(x) = \|L_g\mu\|$ almost everywhere with respect to $\mu$; in particular, this must hold for each $x \in \mathrm{supp}(\mu)$. Suppose there are at least two different points $x,y \in \mathrm{supp}(\mu)$. Since $B$ separates points and contains $1_X$, we can easily find $0 \le g \le 1_X$ such that $g(x) \neq g(y)$, contradicting constancy of $g$ on $\mathrm{supp}(\mu)$. So $\mathrm{supp}(\mu) = \{x\}$, which, given (B.174), implies that $\mu = \pm\delta_x$, so that $\mu(1_X) = \pm 1$. Since $1_X \in B$, this contradicts (B.172). Hence (B.174) leads to a contradiction, and we are left with the other possibility $\|\mu\| = 0$. This gives $\mu = 0$, that is, (B.173).
B.11 Choquet's Theorem


Choquet's Theorem B.53 beautifully follows up on the Krein–Milman Theorem. To state it, we need the support $\mathrm{supp}(\mu)$ of a measure $\mu$ on a space $X$, defined as the smallest closed set $F$ such that $\mu(X \backslash F) = 0$, or, equivalently, as the largest closed set $F$ such that each open neighbourhood $U$ of each $x \in F$ has strictly positive measure $\mu(U) > 0$, provided such a set exists. This is the case, for example, if $X$ is locally compact Hausdorff and $\mu$ is (inner) regular. To see this, let $\{U_\lambda\}$ be the set of all open $U_\lambda \in \mathcal{O}(X)$ such that $\mu(U_\lambda) = 0$, and let $U = \bigcup_\lambda U_\lambda$. By inner regularity, $\mu(U) = \sup\{\mu(K) \mid K \subseteq U, K \in \mathcal{K}(X)\}$. Since each such $K$ is compact, $K \subseteq \bigcup_{i=1}^n U_{\lambda_i}$, whence $\mu(K) \le \sum_i \mu(U_{\lambda_i}) = 0$. Hence $\mu(U) = 0$, and $\mathrm{supp}(\mu) = X \backslash U$.


Theorem B.53. In the notation of Theorem B.50, for each $\varphi \in K$ there is a probability measure $\mu$ on $K$ whose support is contained in $\partial_e K^-$ such that for each $v \in V$,

$$\varphi(v) = \int_K d\mu(\omega)\, \omega(v). \tag{B.182}$$

Moreover, if $K$ is metrizable, then the support of $\mu$ may be restricted to $\partial_e K$.


Here $\partial_e K^- = (\partial_e K)^-$ is the closure of $\partial_e K$; in many examples (e.g., state spaces of C*-algebras of infinite quantum systems), $\partial_e K$ is not closed or even Borel.

Reading (B.182) from right to left, the point $\varphi \in K$ is called the barycenter of $\mu$.
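A finite-dimensional toy version of the barycenter formula (B.182) may help. The sketch below assumes $K = [0,1]^2 \subset \mathbb{R}^2$, whose extreme points are the four corners, and represents a point of $K$ as the barycenter of a probability measure supported on the corners (here a product of two-point weights, which is one choice among many). The names `corner_measure` and `integrate` are illustrative only.

```python
from fractions import Fraction as Fr

def corner_measure(x, y):
    """A probability measure on the corners of [0,1]^2 with barycenter (x, y)."""
    return {(0, 0): (1 - x) * (1 - y), (1, 0): x * (1 - y),
            (0, 1): (1 - x) * y,       (1, 1): x * y}

def integrate(mu, v):
    """Integral of the linear functional omega -> <v, omega> against mu."""
    return sum(w * (v[0] * omega[0] + v[1] * omega[1]) for omega, w in mu.items())

point = (Fr(1, 2), Fr(1, 2))
mu = corner_measure(*point)
# A second measure with the same barycenter: half the mass on each of two opposite corners.
nu = {(0, 0): Fr(1, 2), (1, 1): Fr(1, 2)}

v = (3, -5)                                   # an arbitrary linear functional
lhs = v[0] * point[0] + v[1] * point[1]       # the functional evaluated at the barycenter
print(integrate(mu, v), integrate(nu, v), lhs)
```

Both measures reproduce the value of every linear functional at the centre of the square, so the representing measure in Choquet's Theorem is in general not unique. Uniqueness characterizes simplices, a point not pursued here.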
Preparing for the proof, we note that if $X$ is a compact Hausdorff space, the dual $C(X)^*$ of $C(X)$ as a Banach space (in the sup-norm) is the space $M(X)$ of all complete regular complex measures on $X$; cf. Theorem B.24. The set $M_1^+(X)$ of all complete regular probability measures on $X$ is a closed subset of the unit ball of $M(X)$, since $\|\mu\| = \mu(X) = 1$ if $\mu \in M_1^+(X)$, cf. (B.54), and hence $M_1^+(X)$ is w*-compact by the Banach–Alaoglu Theorem. We will use these facts with $X = \partial_e K^-$.

We also recall that a (not necessarily continuous) function $f : K \to \mathbb{R}$ is affine if

$$f(t\varphi_1 + (1-t)\varphi_2) = tf(\varphi_1) + (1-t)f(\varphi_2), \tag{B.183}$$

for $t \in (0,1)$ and $\varphi_1 \neq \varphi_2 \in K$, concave if one has $\ge$ instead of $=$ in (B.183), convex with $\le$ instead of $=$, and strictly convex if (B.183) holds with $= \rightsquigarrow <$.

For example, $f(x) = x^2$ is strictly convex on $[-1,1]$. The assumption of metrizability will only be used to prove the existence of a strictly convex continuous function on $K$, so this existence could have been assumed instead of metrizability. Finally, we denote the space of real-valued continuous affine functions on $K$ by $A(K)$.


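Proof. By Theorem B.50, $\varphi = \lim_\lambda \varphi_\lambda$, where $(\varphi_\lambda)$ is some net in $\mathrm{co}(\partial_e K)$, so that $\varphi_\lambda = \sum_i p_i^{(\lambda)} \varphi_i^{(\lambda)}$ with $\varphi_i^{(\lambda)} \in \partial_e K$, where the sum is finite, $p_i^{(\lambda)} > 0$, and $\sum_i p_i^{(\lambda)} = 1$. Then $\mu_\lambda = \sum_i p_i^{(\lambda)} \delta_{\varphi_i^{(\lambda)}}$ is a probability measure on $\partial_e K$ and hence also on its (compact) closure $\partial_e K^-$. Since $M_1^+(\partial_e K^-)$ is w*-compact, the previous net has a subnet that w*-converges to some $\mu \in M_1^+(\partial_e K^-)$. Noting that $\varphi(v) = \hat v(\varphi)$, where $\hat v \in V^{**}$ is w*-continuous by Proposition B.46, this $\mu$ by construction satisfies (B.182).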




We now prove the last claim. If $K$ is metrizable, then $C(K)$ is separable, so that its subspace $A(K)$ is separable, too. Thus we can find some countable dense subset $(f_n)_{n>0}$ of $A(K)$, in terms of which we define a function $f_0 : K \to \mathbb{R}$ by


$$f_0(\varphi) = \sum_{n=1}^{\infty} \frac{f_n(\varphi)^2}{2^n\,(\|f_n\|_\infty^2 + 1)}. \tag{B.184}$$


First, continuity of $f_0$ follows from uniform convergence of this series and continuity of each $f_n$; recall that $A(K) \subset C(K,\mathbb{R})$. Second, the $x^2$ example just given implies that if $f \in A(K)$, then $f^2$ is convex, and it is even strictly convex provided there is at least one $n > 0$ for which $f_n(\varphi_1) \neq f_n(\varphi_2)$. To show that this is the case, we note that since $V \cong \hat V \subset V^{**}$ separates points in $V^*$ and each $\hat v \in \hat V$ defines an element of $A(K)$ by restriction, $A(K)$ separates points in $K$. Therefore, by density of the family $(f_n)$, the claim follows, and $f_0$ is strictly convex. This will be crucial.
For each real-valued $f \in C(K,\mathbb{R})$, define the concave envelope $\hat f$ by

$$\hat f(\varphi) = \inf\{g(\varphi) \mid g \in A(K),\ g \ge f\}. \tag{B.185}$$

The terminology comes from the fact that $f \le \hat f$ for any $f \in C(K)$, with equality if $f$ is concave; this is because for any continuous concave function $f$ we may write

$$f(\varphi) = \inf\{g(\varphi) \mid g \in A(K),\ g \ge f\}. \tag{B.186}$$
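The concave envelope (B.185) can be computed explicitly in one dimension. The sketch below assumes $K = [-1,1]$, where affine functions are $g(x) = ax + b$, and approximates the infimum over a finite grid of slopes $a$, choosing for each slope the smallest intercept $b$ with $g \ge f$ on a grid of points. The grids and names are illustrative only.

```python
def concave_envelope(f, x0, xs, slopes):
    """Approximate inf{ g(x0) : g affine, g >= f on K } using slopes from a grid."""
    best = float("inf")
    for a in slopes:
        b = max(f(x) - a * x for x in xs)   # smallest intercept with a*x + b >= f on xs
        best = min(best, a * x0 + b)
    return best

xs = [i / 100 - 1 for i in range(201)]       # grid on K = [-1, 1], endpoints included
slopes = [j / 10 for j in range(-40, 41)]    # candidate slopes, including 0

def f0(x):
    return x * x                             # strictly convex on K

env_mid = concave_envelope(f0, 0.0, xs, slopes)
env_half = concave_envelope(f0, 0.5, xs, slopes)
env_end = concave_envelope(f0, 1.0, xs, slopes)
print(env_mid, env_half, env_end)
```

For this strictly convex $f_0$ the envelope is the constant function 1, so $\hat f_0 = f_0$ holds only at the endpoints $\pm 1$, which are exactly the extreme points of $K$.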


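In terms of this, for any fixed element $\varphi_0 \in K$ we define $p : C(K,\mathbb{R}) \to \mathbb{R}$ by

$$p(f) = \hat f(\varphi_0). \tag{B.187}$$

Since $\widehat{f+g} \le \hat f + \hat g$ and $\widehat{tf} = t\hat f$ for $t \ge 0$, as is easily verified, it follows that $p$ is sublinear (cf. Definition B.38). We define a linear subspace $W \subset C(K,\mathbb{R})$ by

$$W = A(K) + \mathbb{R}\cdot f_0, \tag{B.188}$$

endowed with the 'hatted' evaluation map $\widehat{\mathrm{ev}}_{\varphi_0} : W \to \mathbb{R}$ defined by

$$\widehat{\mathrm{ev}}_{\varphi_0}(g + sf_0) = g(\varphi_0) + s\hat f_0(\varphi_0); \tag{B.189}$$

since $\hat g = g$ for any $g \in A(K)$, for $s \ge 0$ we have $\widehat{\mathrm{ev}}_{\varphi_0}(g + sf_0) = \mathrm{ev}_{\varphi_0}(\widehat{g + sf_0})$.

It is easy to show that $p$ dominates $\widehat{\mathrm{ev}}_{\varphi_0}$, so that the Hahn–Banach Theorem B.40 yields an extension $\widetilde{\mathrm{ev}}_{\varphi_0}$ of $\widehat{\mathrm{ev}}_{\varphi_0}$ to $C(K,\mathbb{R})$ that satisfies $\widetilde{\mathrm{ev}}_{\varphi_0}(f) \le \hat f(\varphi_0)$. This implies that $\widetilde{\mathrm{ev}}_{\varphi_0}$ is positive; to see this, take $f \le 0$. Since the zero function is in $A(K)$ we have $\hat f \le 0$ also, so that $\widetilde{\mathrm{ev}}_{\varphi_0}(f) \le 0$. Passing to $-f$, we find that $\widetilde{\mathrm{ev}}_{\varphi_0}(f) \ge 0$ whenever $f \ge 0$. Furthermore, since $1_K \in A(K) \subset W$, we have

$$\widetilde{\mathrm{ev}}_{\varphi_0}(1_K) = \widehat{\mathrm{ev}}_{\varphi_0}(1_K) = 1_K(\varphi_0) = 1.$$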
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Therefore, Voy is a state on C(K). Corollary B.17 then turns CVn into a probability 
measure Lt on K. Taking f = 9 for some v € V, we have f € A(K) C W, so that 


[du(o) o(v) = [due =u(6) =e (0) = 0G) = 9o(v). (8.190) 
This is almost (B.182) with @ ~» @o; what we still need to prove is the property 
supp(tt) C 0K. (B.191) 
This will be proved in two steps. For any $f \in C(K)$, we define $K(f) \subset K$ by

$K(f) = \{ \varphi \in K \mid f(\varphi) = \hat{f}(\varphi) \}.$ (B.192)

We will separately show that

$\mathrm{supp}(\mu) \subseteq K(f_0);$ (B.193)
$K(f_0) \subseteq \partial_e K.$ (B.194)

Towards (B.193) we start by showing that

$\mu(f_0) = \mu(\hat{f}_0),$ (B.195)


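which is a conjunction of $\mu(f_0) \leq \mu(\hat{f}_0)$ and $\mu(f_0) \geq \mu(\hat{f}_0)$. The first is true for any
$f \in C(K)$, since $\mu$ is positive and $f \leq \hat{f}$ (pointwise). The second is specific to $f_0$:

$\mu(f_0) = \widetilde{\mathrm{ev}}_{\varphi_0}(f_0) = \widehat{\mathrm{ev}}_{\varphi_0}(f_0) = \hat{f}_0(\varphi_0)$
$\quad = \inf\{ g(\varphi_0) \mid g \in A(K),\ g \geq f_0 \}$
$\quad = \inf\{ \mu(g) \mid g \in A(K),\ g \geq f_0 \},$ (B.196)

since for $g \in A(K)$ we have $g(\varphi_0) = \mu(g)$ because $A(K) \subset W$. If in addition $g \geq f_0$,
we have $g \geq \hat{f}_0$, which implies $\mu(g) \geq \mu(\hat{f}_0)$. This inequality survives the infimum
in (B.196), so that we finally obtain $\mu(f_0) \geq \mu(\hat{f}_0)$, and hence (B.195).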
We now prove (B.193) from (B.195). Since $f_0 \leq \hat{f}_0$, for each $n > 0$ we may define

$K_n = \{ \varphi \in K \mid \hat{f}_0(\varphi) - f_0(\varphi) \geq 1/n \}.$ (B.197)

Then $0 \leq \mu(K_n) \leq n \int_K d\mu\, (\hat{f}_0 - f_0)$, which vanishes by (B.195). Hence $\mu(K_n) = 0$
for each $n$, and therefore $\mu(\cup_n K_n) = 0$. But $\cup_n K_n = K(f_0)^c$, so (B.193) follows.
Eq. (B.194) is equivalent to the inclusion $(\partial_e K)^c \subset K(f_0)^c$, i.e., the implication:
if $\varphi = t\varphi_1 + (1-t)\varphi_2$ for some $t \in (0,1)$ and $\varphi_1 \neq \varphi_2$, then $f_0(\varphi) \neq \hat{f}_0(\varphi)$.


Indeed, strict convexity of $f_0$ (used at last!) and the familiar property $f_0 \leq \hat{f}_0$ give

$\hat{f}_0(\varphi) = \inf\{ t g(\varphi_1) + (1-t) g(\varphi_2) \mid g \in A(K),\ g \geq f_0 \}$
$\quad \geq t \inf\{ g(\varphi_1) \mid g \in A(K),\ g \geq f_0 \} + (1-t) \inf\{ g(\varphi_2) \mid g \in A(K),\ g \geq f_0 \}$
$\quad = t \hat{f}_0(\varphi_1) + (1-t) \hat{f}_0(\varphi_2) \geq t f_0(\varphi_1) + (1-t) f_0(\varphi_2) > f_0(\varphi).$

In turn, the existence of some measure $\mu$ in (B.182) representing an arbitrary
point $\varphi \in K$ implies the Krein–Milman Theorem. We rewrite (B.182) as


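$\hat{v}(\varphi) = \int_{\partial_e K^-} d\mu\, \hat{v},$ (B.198)

where $\varphi \in K$ is arbitrary and $\hat{v} \in C(K)$ is the (affine) continuous function on $K \subset V^*$
induced by the functional $\hat{v} \in V^{**}$ on $V^*$ defined by $v \in V$ under the canonical
injection $V \to V^{**}$, $v \mapsto \hat{v}$, see Proposition B.44. From (B.198) and (B.34) we obtain

$|\hat{v}(\varphi)| \leq \sup\{ |\hat{v}(\varphi')| : \varphi' \in \partial_e K^- \},$

which, because $\partial_e K^- \subset (\mathrm{co}(\partial_e K))^-$, also gives the inequality

$|\hat{v}(\varphi)| \leq \sup\{ |\hat{v}(\varphi')| : \varphi' \in (\mathrm{co}(\partial_e K))^- \}.$

This forces $\varphi \in (\mathrm{co}(\partial_e K))^-$, for if $\varphi \notin (\mathrm{co}(\partial_e K))^-$ we would obtain a contradiction
with Theorem B.43 (which is a version of the Hahn–Banach Theorem), or more
precisely, with the alternative version thereof stated after its proof, with $A = \{\varphi\}$
closed and $B = (\mathrm{co}(\partial_e K))^-$ compact and convex (and, of course, $\varphi \rightsquigarrow -\varphi$).
Therefore, $K \subset (\mathrm{co}(\partial_e K))^-$, which implies (B.165).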

If only to illustrate Choquet’s Theorem, we note that existence of the probability
measure $\mu$ in the Riesz Representation Theorem B.15 follows from it. To see this,
fix some compact Hausdorff space $X$, and take $V = C(X,\mathbb{R})$ (as a real Banach space
in the supremum-norm) and $K = S(C(X,\mathbb{R})) \subset V^*$, i.e., the set of positive linear
functionals $\varphi : C(X,\mathbb{R}) \to \mathbb{R}$ that satisfy $\varphi(1_X) = 1$. By the argument following
Definition 1.14, $K$ coincides with the state space $S(C(X))$ of the commutative C*-algebra
$C(X)$, which is a complex Banach space (cf. Appendix C), in that each $\varphi \in K$
extends uniquely to a state $\varphi : C(X) \to \mathbb{C}$ by complex linearity, which extension
remains positive in the sense of Definition C.3. From Propositions C.14 and C.19,
the map $X \to V^*$ given by $x \mapsto \mathrm{ev}_x$, where $\mathrm{ev}_x(f) = f(x)$ is the evaluation map at $x$,
takes values in $\partial_e K$ and yields a homeomorphism

$\partial_e K \cong X.$ (B.199)

In particular, $\partial_e K$ is closed in $V^*$ (and in $K$), so (B.182) comes down to (B.39).

The part of Theorem B.15 that does not follow from Theorem B.53 is the possible
uniqueness of the measure $\mu$ on $\partial_e K^-$ that represents the point $\varphi \in K$. Uniqueness
of the measure in Choquet’s Theorem is settled by the following notion.


Definition B.54. A (Choquet) simplex is a compact convex set $K \subset V^*$ whose associated
convex cone $\tilde{K} = \mathbb{R}^+ \cdot K = \{ t\omega \mid t \geq 0, \omega \in K \}$ (cf. Definition C.50) is a
lattice in the partial ordering $\leq$ defined by $\rho \leq \sigma$ iff $\sigma - \rho \in \tilde{K}$.


Here we assume that for any $\rho \in \tilde{K}$ there is a unique $t \in \mathbb{R}^+$ and $\omega \in K$ such that
$t\omega = \rho$; this is the case if $K = \tilde{K} \cap H$ for some closed hyperplane $H$ in $V^*$ that does
not contain the origin. For example, if $K = S(A)$ is the state space of some unital
C*-algebra $A$, then $H = \{ \varphi \in A^* \mid \varphi(1_A) = 1 \}$ and $\tilde{K} = \{ \varphi \in A^* \mid \varphi \geq 0 \}$.

In finite dimension, Choquet simplices are special convex polytopes called simplices.
Recall that the so-called regular polyhedra were classified (up to affine isomorphism)
by Schläfli in 1852, who showed that the only possibilities are:


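The simplices $\Delta_n = \{ x \in \mathbb{R}^{n+1} \mid x_i \geq 0, \sum_i x_i = 1 \}$, $n \geq 1$;

The cubes $Q_n = \{ x \in \mathbb{R}^n \mid -1 \leq x_i \leq 1 \}$, $n \geq 1$;

The cross-polytopes $O_n = \{ x \in \mathbb{R}^n \mid \sum_i |x_i| \leq 1 \}$, $n \geq 1$;

The countably many regular polygons in $\mathbb{R}^2$ (which include $Q_2, O_2, \Delta_2$);
The five Platonic solids in $\mathbb{R}^3$ (which include $Q_3, O_3, \Delta_3$);

The six regular polychora in $\mathbb{R}^4$ (which include $Q_4, O_4, \Delta_4$).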


An $n$-dimensional simplex is affinely homeomorphic to the convex hull of $n+1$ linearly
independent points (or, equivalently, $|\partial_e K| = n+1$). In particular, the simplex
$\Delta_n$ is the set $\mathrm{Pr}(n+1)$ of all probability distributions on a set $X = n+1$ of
cardinality $n+1$, cf. Definition 1.9. Generalizing this idea, if $X$ is a compact Hausdorff
space, then the state space $S(C(X))$ of the associated commutative C*-algebra $C(X)$,
which as we know consists of all probability measures on $X$, is a Choquet simplex.
In the notation of Theorem B.53, the simplest result (again due to Choquet) is:


Theorem B.55. Suppose $K$ is metrizable, and assume $\mathrm{supp}(\mu) \subset \partial_e K$ in (B.182).
Then $\mu$ is uniquely determined by its barycenter $\varphi$ iff $K$ is a Choquet simplex.


However, we note that without any assumption on $K$, conversely the barycenter $\varphi$
for which (B.182) holds for all $v \in V$ is uniquely determined by $\mu$. This observation
gives rise to a map $B$ from the compact convex set $M_1^+(K)$ of all probability measures
on $K$ to $K$ itself, such that $B(\mu)$ is the unique point in $K$ such that (B.198) with
$\varphi = B(\mu)$ holds for all $v \in V$. This map $B$ is, in fact, affine as well as continuous.

Theorem B.55 covers finite phase spaces in classical mechanics as well as,
negatively, finite-dimensional Hilbert spaces in quantum mechanics: in the former
case, any state admits a unique decomposition into pure states (cf. Proposition 1.13),
whereas in the latter this fails. For example, for $H = \mathbb{C}^2$, the state space
$S(B(H)) \cong B^3$ (see Proposition 2.9) is not a simplex. See also Proposition 2.14.
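
The failure of uniqueness for $H = \mathbb{C}^2$ can be checked by hand on the maximally mixed state. The following sketch (plain Python; the helper names `projector`, `mix` and `close` are illustrative and do not come from the text) builds the same density matrix from two different pairs of pure states.

```python
# Density matrices on H = C^2 are stored as 2x2 nested lists.

def projector(vec):
    """Rank-one projection |vec><vec| for a unit vector vec in C^2."""
    return [[vec[i] * complex(vec[j]).conjugate() for j in range(2)]
            for i in range(2)]

def mix(weights, states):
    """Convex combination sum_k weights[k] * states[k] of 2x2 matrices."""
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

r = 2 ** -0.5
e0, e1 = [1, 0], [0, 1]          # first orthonormal basis
plus, minus = [r, r], [r, -r]    # second orthonormal basis

# Two different mixtures of pure states with equal weights.
rho_a = mix([0.5, 0.5], [projector(e0), projector(e1)])
rho_b = mix([0.5, 0.5], [projector(plus), projector(minus)])
```

Both mixtures give half the unit matrix, although the pure states involved differ, so the decomposition of this state into pure states is not unique.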

To explain the general (i.e., non-metrizable) case, we first define the Choquet
ordering $\prec$ on the set of probability measures on $K$ by $\mu \prec \nu$ iff $\mu(f) \leq \nu(f)$ for any
convex function $f \in C(K,\mathbb{R})$. Noting that $B(\mu) = B(\nu)$ whenever $\mu \prec \nu$, the idea
is that since the values of convex functions almost by definition increase towards
the boundary $\partial_e K$, probability measures on $K$ with given barycenter that are maximal
with respect to $\prec$ should be supported on $\partial_e K$ (such maximal measures always
exist by a Zorn’s Lemma argument). This intuition is indeed correct, provided $K$
is metrizable, in which case, conversely, the condition $\mathrm{supp}(\mu) \subset \partial_e K$ in Theorem
B.55 forces $\mu$ to be maximal. In general, an alternative way to prove the first part of
Theorem B.53 would be to take some maximal $\mu$ with given barycenter $\varphi$.

The key to the generalization of Theorem B.55 to the possibly non-metrizable
case, then, is to replace the assumption $\mathrm{supp}(\mu) \subset \partial_e K$ by maximality of $\mu$. This is
achieved by the major Choquet–Meyer Theorem, which we state without proof:


Theorem B.56. Assume the measure $\mu$ in (B.182) is maximal with respect to $\prec$.
Then $\mu$ is uniquely determined by its barycenter $\varphi$ iff $K$ is a Choquet simplex.


B.12 A précis of infinite-dimensional Hilbert space


The main difference between infinite-dimensional Hilbert spaces and their finite- 
dimensional counterparts lies in issues of convergence and completeness. Every 
linear subspace of a finite-dimensional Hilbert space is automatically complete (cf. 
Proposition B.5), and all sums one encounters are finite. In infinite dimension, $\ell_c(\mathbb{N})$
is a linear but incomplete subspace of $\ell^2(\mathbb{N})$, and similarly for $C_c(\mathbb{R}) \subset L^2(\mathbb{R})$; the
expansion of some vector in terms of a basis already involves an infinite sum.

Note that in metric spaces a subset is closed iff it is sequentially complete (in
that it contains all limits of Cauchy sequences); this can be seen from the fact that
the metric topology is generated by $\varepsilon$-balls and hence by $(1/n)$-balls, $n \in \mathbb{N}$.
Consequently, in Banach spaces (and hence in Hilbert spaces) $H$, the property of some
subspace $L \subset H$ being (metrically) complete (in the sense that every Cauchy sequence
in $L$ converges to an element of $L$) is the same as $L$ being (topologically)
closed (in the sense that the set-theoretic complement $L^c$ is open). Following tradition
in functional analysis, we will henceforth speak of closed subspaces. We denote
the (metric or topological) closure of $S \subset H$ in $H$ by $S^-$.

An exhaustive way of guaranteeing that some linear subspace $L \subset H$ is closed is
to exhibit it as an orthogonal complement $L = S^\perp$, where $S \subset H$ is any subset: we
write $\psi \perp S$ iff $(\varphi, \psi) = 0$ for each $\varphi \in S$, and, as in (A.29), put

$S^\perp = \{ \psi \in H \mid \psi \perp S \}.$ (B.200)

We also use the double orthogonal complement $S^{\perp\perp} = (S^\perp)^\perp$, et cetera.
Proposition B.57. Let $H$ be a Hilbert space.

1. If $S \subset H$ is any subset, $S^\perp$ is a closed linear subspace of $H$.
2. For each closed linear subspace $L \subset H$, one has

$H = L \oplus L^\perp,$ (B.201)

in the sense that

$L \cap L^\perp = \{0\},$ (B.202)

and each vector $\psi \in H$ has a unique decomposition

$\psi = \psi^\parallel + \psi^\perp,$ (B.203)

where $\psi^\parallel \in L$ and $\psi^\perp \in L^\perp$.
3. For any closed linear subspace $L$ one has $L^{\perp\perp} = L$.
4. For any linear subspace $L$, one has

$L^{\perp\perp} = L^-,$ (B.204)

and hence $L^- = H$ iff $(\psi, \varphi) = 0$ for each $\varphi \in L$ implies $\psi = 0$.
5. For any subset $S \subset H$, one has $S^{\perp\perp\perp} = S^\perp$.
6. For any subset $S \subset H$, the closure $[S]^-$ of the (finite) linear span $[S]$ of $S$ is $S^{\perp\perp}$.

Proof. 1. Linearity of $S^\perp$ follows from linearity of the inner product. If $\psi_n \in S^\perp$
and $\psi_n \to \psi$, then for $\varphi \in S$ and each $n$, we have

$|(\varphi, \psi)| = |(\varphi, \psi - \psi_n)| \leq \|\varphi\| \|\psi - \psi_n\|.$ (B.205)

Taking $n \to \infty$ gives $(\varphi, \psi) = 0$ and hence $\psi \in S^\perp$, so that $S^\perp$ is closed.

2. The proof of the infinite-dimensional case (cf. Corollary A.9 for finite dimension)
relies on Riesz Lemma B.58 below, which explains why $L$ needs to be closed, and
also neatly identifies $\psi^\parallel$ as the unique vector in $L$ at minimal distance to $\psi$.
Granting this important lemma, for $\psi \in H$ we take

$C = \psi + L = \{ \psi + \varphi, \varphi \in L \}.$ (B.206)

Lemma B.58 yields a unique vector $\chi_0 \in C$ of minimal norm, from which we define
$\psi^\parallel = \psi - \chi_0$ and $\psi^\perp = \chi_0$ (so that $\|\psi^\parallel - \psi\| = \|\chi_0\|$ is minimal). Then $\psi^\parallel \in L$, and (B.203)
holds by construction. To show that $\chi_0 \in L^\perp$, we rewrite the inequality $\|\chi_0\| \leq
\|\psi + \varphi\|$ (for all $\varphi \in L$) as $\|\chi_0\| \leq \|\chi_0 + \varphi\|$, since $\psi = \chi_0 + \psi^\parallel$ and $\psi^\parallel \in L$.
Putting $\varphi = -((\xi, \chi_0)/\|\xi\|^2)\xi$, with $\xi \in L$ arbitrary (but nonzero), the last inequality
(squared) reads $0 \leq -|(\xi, \chi_0)|^2/\|\xi\|^2$, whence $(\xi, \chi_0) = 0$ for all $\xi \in L$, so that
$\chi_0 \in L^\perp$. Uniqueness of the decomposition (B.203) follows as in Corollary A.9.

3. Trivially, $L \subset L^{\perp\perp}$. To prove the converse inclusion, use the previous item.

4. If $A \subset B$, then $B^\perp \subset A^\perp$ and hence $A^{\perp\perp} \subset B^{\perp\perp}$. With $A = L$ and $B = L^-$, this
gives $L^{\perp\perp} \subset (L^-)^{\perp\perp} = L^-$ (where, $L^-$ being closed, we used the previous item).
Conversely, $L \subset L^{\perp\perp}$ and hence $L^- \subset L^{\perp\perp}$, since $L^{\perp\perp}$ is closed by the first item.

5. Take $L = S^\perp$ and use the third item.

6. Proceeding as in the proof of no. 1, from the continuity of the inner product we
find $S^\perp = ([S]^-)^\perp$, and hence, using no. 3, finally $S^{\perp\perp} = ([S]^-)^{\perp\perp} = [S]^-$.


Lemma B.58. The norm assumes a unique minimum on any closed convex set $C \subset H$
(i.e., there is a unique $\chi_0 \in C$ such that $\|\chi_0\| < \|\chi\|$ for each $\chi \in C$, $\chi \neq \chi_0$).


Proof. Let $\mu = \inf\{ \|\chi\|, \chi \in C \}$, which exists, as $\|\chi\| \geq 0$. Hence there is a minimizing
sequence $(\chi_n)$ in $C$ with $\|\chi_n\| \to \mu$, which we now prove to be Cauchy (in
$H$). Since $C$ is convex, $\frac{1}{2}(\chi_n + \chi_m) \in C$, and therefore $\|\chi_n + \chi_m\| \geq 2\mu$. Thus

$0 \leq \|\chi_n - \chi_m\|^2 = 2(\|\chi_n\|^2 + \|\chi_m\|^2) - \|\chi_n + \chi_m\|^2 \leq 2(\|\chi_n\|^2 + \|\chi_m\|^2) - 4\mu^2,$

and since $2(\|\chi_n\|^2 + \|\chi_m\|^2) \to 4\mu^2$ as $n, m \to \infty$, we must have $\|\chi_n - \chi_m\| \to 0$.
Since $C$ is closed, $\chi_n \to \chi_0$ for some $\chi_0 \in C$. To prove uniqueness, let another minimizing
sequence $(\chi'_n)$ converge to $\chi'_0 \in C$. Then $\frac{1}{2}(\chi_0 + \chi'_0) \in C$, so we obtain

$\|\chi_0 + \chi'_0\| \geq 2\mu = \|\chi_0\| + \|\chi'_0\|.$

The inequality $\|\chi_0 + \chi'_0\| \leq \|\chi_0\| + \|\chi'_0\|$ gives $\|\chi_0 + \chi'_0\| = \|\chi_0\| + \|\chi'_0\|$, i.e.,
$\mathrm{Re}(\chi'_0, \chi_0) = \|\chi'_0\| \|\chi_0\|$. Cauchy–Schwarz gives $|(\chi'_0, \chi_0)| \leq \|\chi'_0\| \|\chi_0\|$ with equality
iff $\chi'_0$ and $\chi_0$ are proportional, so the previous equality can hold only if $\chi'_0 = t\chi_0$
for some $t \geq 0$. Since $\chi'_0$ and $\chi_0$ both minimize the norm, we have $t = 1$.
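
In finite dimension the decomposition (B.203) and the minimal-distance property behind Lemma B.58 can be checked directly. The sketch below assumes $H = \mathbb{C}^3$ and a one-dimensional subspace $L$ spanned by a single vector; the names `inner`, `norm` and `decompose` are illustrative only, and the inner product is taken linear in its second entry.

```python
import math

def inner(x, y):
    """Inner product on C^n, anti-linear in the first and linear in the second entry."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def decompose(psi, ell):
    """Split psi into a component in L = span{ell} and a component orthogonal to L."""
    coeff = inner(ell, psi) / inner(ell, ell)
    parallel = [coeff * a for a in ell]
    perpendicular = [p - q for p, q in zip(psi, parallel)]
    return parallel, perpendicular

psi = [1 + 2j, 3, -1j]
ell = [1, 1j, 0]
psi_par, psi_perp = decompose(psi, ell)

# The perpendicular part is orthogonal to L ...
residual = abs(inner(ell, psi_perp))
# ... and no sampled point t * ell of L is closer to psi than psi_par.
distances = [norm([p - t * a for p, a in zip(psi, ell)])
             for t in (0, 1, -1, 0.5 - 0.5j, 2j)]
```

Here `norm(psi_perp)` plays the role of the minimal norm on the closed convex set $C = \psi + L$; the sampled distances only illustrate the inequality and do not prove it.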
We now turn to the important concept of a basis of a Hilbert space; as in the
previous appendix, a basis of a Hilbert space always denotes an orthonormal basis.
To define this notion, we first say that some subset $\{\upsilon_i\}_{i \in I}$ of $H$ is orthonormal if

$(\upsilon_i, \upsilon_j) = \delta_{ij};$ (B.207)

this condition guarantees that the $\upsilon_i$ are linearly independent (and easy to calculate
with!). Second, in finite dimension (where $I$ must be finite) we may simply define
a basis of $H$ as an orthonormal set that is also a basis in the usual (linear algebra)
sense. This idea remains valid for general Hilbert spaces, except that we should use
Definition B.6 to define infinite sums (and Lemma B.7 to analyze them). Theorem
B.61 to come gives an exhaustive account of the situation, but we first need a lemma
on general orthonormal sets (that do not necessarily form a basis).


Lemma B.59. If $\{\upsilon_i\}_{i \in I}$ is an orthonormal set in $H$ and $c_i \in \mathbb{C}$, then the sum

$\psi = \sum_{i \in I} c_i \upsilon_i$ (B.208)

converges in $H$ (in the sense of Definition B.6) iff

$\sum_{i \in I} |c_i|^2 < \infty.$ (B.209)

If this is the case, the coefficients $c_i \in \mathbb{C}$ are given by

$c_i = (\upsilon_i, \psi).$ (B.210)


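Proof. The first claim follows from Proposition B.8 and the elementary computation

$\| \sum_{i \in G'} c_i \upsilon_i \|^2 = \sum_{i \in G'} \| c_i \upsilon_i \|^2 = \sum_{i \in G'} |c_i|^2 < \varepsilon,$ (B.211)

where $G'$ is finite, so that the sums $\sum_{i \in I} c_i \upsilon_i$ and $\sum_{i \in I} |c_i|^2$ either both exist (i.e.,
converge) or both do not exist. When $I$ is countable this follows more simply by
noting that $\sum_{i \in \mathbb{N}} c_i \upsilon_i$ converges iff $(s_n)$ is a Cauchy sequence, where $s_n = \sum_{i=1}^n c_i \upsilon_i$,
and computing $\| s_n - s_m \|^2 = \sum_{i=m+1}^n |c_i|^2$, where $n > m$. To prove (B.210) on the
assumption that (B.208) exists, by the Cauchy–Schwarz inequality, for any $\varepsilon > 0$,

$|(\upsilon_j, \psi) - c_j| = |(\upsilon_j, \psi - \sum_{i \in G} c_i \upsilon_i + \sum_{i \in G} c_i \upsilon_i) - c_j|$
$\quad = |(\upsilon_j, \psi - \sum_{i \in G} c_i \upsilon_i)| \leq \|\upsilon_j\| \| \psi - \sum_{i \in G} c_i \upsilon_i \| < \varepsilon,$

where $G \subset I$ is a sufficiently large finite subset containing $j$, and where we used
Definition B.6 as well as $\|\upsilon_j\| = 1$. Letting $\varepsilon \to 0$ yields (B.210).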


Lemma B.60. Let $\{\upsilon_i\}_{i \in I}$ be an orthonormal set in $H$. We have Bessel’s Inequality

$\sum_{i \in I} |(\upsilon_i, \psi)|^2 \leq \|\psi\|^2 \quad (\psi \in H).$ (B.212)

Proof. For any finite $G \subset I$, a computation based on (A.2) yields

$\sum_{i \in G} |(\upsilon_i, \psi)|^2 = \|\psi\|^2 - \| \psi - \sum_{i \in G} (\upsilon_i, \psi) \upsilon_i \|^2 \leq \|\psi\|^2.$ (B.213)

It follows that also the supremum of the left-hand side over all finite subsets $G \subset I$
is bounded by $\|\psi\|^2$ and hence is finite. By Lemma B.7, this supremum equals
$\sum_{i \in I} |(\upsilon_i, \psi)|^2$, which gives (B.212).


Theorem B.61. Let $B = \{\upsilon_i\}_{i \in I}$ be an orthonormal subset of a Hilbert space $H$. The
following conditions are equivalent (and each defines $B$ to be a basis of $H$):

1. Any $\psi \in H$ can be written (in the sense of Definition B.6) as $\psi = \sum_{i \in I} c_i \upsilon_i$.
2. For each $\psi \in H$, one has Parseval’s equality

$\sum_{i \in I} |(\upsilon_i, \psi)|^2 = \|\psi\|^2.$ (B.214)

3. For any $\psi, \varphi \in H$ one has

$(\varphi, \psi) = \sum_{i \in I} (\varphi, \upsilon_i)(\upsilon_i, \psi).$ (B.215)

4. $B$ is not properly contained in any other orthonormal set (i.e., $B$ is maximal).
5. $B^\perp = \{0\}$.
6. $B^{\perp\perp} = H$.
7. The closure of the linear span of $B$ is $H$.


Note that (B.215) is used in almost every computation in quantum physics, in which
one also typically has $\|\psi\| = 1$. In that case, (B.214) at least formally turns the
$|c_i|^2 = |(\upsilon_i, \psi)|^2$ into (Born) probabilities, as discussed throughout the main text.


Proof. Assuming (B.208) and hence (B.210), take $\varepsilon > 0$ and find $G \subset I$ (finite) so
that $\| \psi - \sum_{i \in G} c_i \upsilon_i \| < \varepsilon$. By (B.213), this gives

$\left| \sum_{i \in G} |(\upsilon_i, \psi)|^2 - \|\psi\|^2 \right| < \varepsilon^2.$ (B.216)

Hence (B.214) holds in the sense of Definition B.6 (with $V = \mathbb{C}$). Conversely,
assuming (B.214), eq. (B.213) gives (B.208). This proves the equivalence $1 \leftrightarrow 2$.
Clearly, (B.214) is a special case of (B.215), which in turn follows from (B.208)
with (B.210) and continuity of the inner product, whence $3 \to 2$ and $1 \to 3$.
Furthermore, $1 \to 5$ follows by contradiction: given (B.210), any nonzero vector
$\psi \in B^\perp$ could not possibly be written as (B.208). Conversely, $5 \to 1$ most easily
follows by contradiction, too. For any $\psi \in H$, the sum $\varphi = \sum_{j \in I} (\upsilon_j, \psi) \upsilon_j$ exists in $H$
by Lemma B.59 and Bessel’s Inequality (B.212). Continuity of the inner product yields
$(\upsilon_j, \varphi) = (\upsilon_j, \psi)$ and hence $(\upsilon_j, \varphi - \psi) = 0$ for each $j \in I$, whence $\varphi - \psi \in B^\perp$.
If $\psi$ cannot be written in the form (B.208) we have $\varphi \neq \psi$, so $B^\perp \neq \{0\}$, which is the
desired contradiction.
Finally, $4 \leftrightarrow 5$ is tautological, $5 \leftrightarrow 6$ is trivial, and $6 \leftrightarrow 7$ is a special case of
Proposition B.57.6 (hence this proposition is needed only for no. 7).
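
Bessel’s Inequality (B.212) and Parseval’s equality (B.214) can be compared numerically in $\mathbb{C}^3$. In the sketch below, the vectors `v1`, `v2`, `v3` and the helper `coefficient_sum` are chosen for illustration only: two of the vectors form an orthonormal set that is not a basis, and adding the third completes it to a basis.

```python
def inner(x, y):
    """Inner product on C^n, anti-linear in the first and linear in the second entry."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def coefficient_sum(vectors, psi):
    """Sum of |(v_i, psi)|^2 over the given orthonormal vectors."""
    return sum(abs(inner(v, psi)) ** 2 for v in vectors)

r = 2 ** -0.5
v1 = [1, 0, 0]
v2 = [0, r, r * 1j]
v3 = [0, r, -r * 1j]
psi = [1, 1, 1]

norm_squared = inner(psi, psi).real            # squared norm of psi
partial = coefficient_sum([v1, v2], psi)       # orthonormal set, not a basis
full = coefficient_sum([v1, v2, v3], psi)      # orthonormal basis of C^3
```

The value `partial` stays below `norm_squared`, as Bessel’s Inequality requires, whereas `full` reproduces `norm_squared` up to rounding, in line with Parseval’s equality.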
For example, if $H = \ell^2(S)$, then one may take $I = S$, with $\upsilon_s = \delta_s$. Since $S$ is an
arbitrary set, this example shows that any cardinality of $I$ may, in principle, occur.
The existence of a basis has a remarkable consequence, for which we need:


Definition B.62. Two Hilbert spaces $H_1$ and $H_2$ are called isomorphic, written
$H_1 \cong H_2$, if they are isometrically isomorphic, that is, if there is an invertible linear
map $u \in B(H_1, H_2)$ such that

$\| u\psi \|_{H_2} = \| \psi \|_{H_1} \quad (\psi \in H_1).$ (B.217)


By Theorem A.3, a specific surjective isometry $u : H_1 \to H_2$ implementing an isomorphism
is automatically unitary, in that it is surjective and satisfies

$(u\psi, u\varphi)_{H_2} = (\psi, \varphi)_{H_1}.$ (B.218)


Conversely, a unitary map is an isometric isomorphism, so that isometric isomor- 
phism of Hilbert spaces (seen as Banach spaces) is the same as unitary isomorphism. 
The following theorem (due to von Neumann, who was a specialist in both Hilbert 
space theory and axiomatic set theory) shows that the classification of Hilbert spaces 
up to isomorphism reduces to the classification of sets up to bijection. 


Theorem B.63. 1. Any Hilbert space has a basis.

2. All bases of a given Hilbert space $H$ have the same cardinality (which is then,
consistently, called the dimension of $H$).

3. Two Hilbert spaces are isomorphic iff they have the same dimension.

Specifically, clause 2 states that if $(\upsilon_i)_{i \in I}$ and $(\upsilon'_j)_{j \in J}$ are both bases of $H$, then $I \cong J$
as sets (i.e., there is a bijection $I \to J$). Similarly, clause 3 states that $H_1 \cong H_2$ iff $H_1$
has a basis $(\upsilon_i)_{i \in I}$ and $H_2$ has a basis $(\upsilon'_j)_{j \in J}$ for which $I \cong J$.


Proof. 1. The general proof is, alas, based on Zorn’s Lemma: the collection $O$ of
all orthonormal sets in $H$ is ordered by inclusion and each totally (i.e. linearly)
ordered subset has an upper bound, namely its union. Hence $O$ has a maximal
element, which is a basis by Theorem B.61.4. Fortunately, in case that $H$ is
(topologically) separable (in that it contains a countable dense subset), a basis
may be constructed by the well-known Gram–Schmidt procedure, as follows:
let $(\psi_1, \psi_2, \ldots)$ be a countable subset of $H$ with dense linear span, for simplicity
already taken to be linearly independent (otherwise, first remove each vector that is a
linear combination of its predecessors), start with $\upsilon_1 = \psi_1/\|\psi_1\|$, inductively define
$\psi'_n = \psi_n - \sum_{i=1}^{n-1} (\upsilon_i, \psi_n) \upsilon_i$, $n \in \mathbb{N}$, which already yields an orthogonal set,
and finally normalize to $\upsilon_n = \psi'_n/\|\psi'_n\|$.

2. We only prove the case where one basis, say $\{\upsilon_i\}_{i \in I}$, is finite in some detail.
Take another basis $\{\upsilon'_j\}_{j \in J}$. From (B.214) and (B.215),

$|I| = \sum_{i \in I} \|\upsilon_i\|^2 = \sum_{i \in I} \sum_{j \in J} |(\upsilon'_j, \upsilon_i)|^2 = \sum_{i \in I} \sum_{j \in J} (\upsilon'_j, \upsilon_i)(\upsilon_i, \upsilon'_j) = \sum_{j \in J} \|\upsilon'_j\|^2 = |J|.$

A similar computation excludes the possibility that $I$ is countable and $J$ is not.
The general case relies on some cardinal arithmetic, which we spare the reader.

3. Let $\{\upsilon_i\}_{i \in I}$ be a basis of $H$ and let $\{\upsilon'_j\}_{j \in J}$ be a basis of $H'$. Assume $I \cong J$, so
that there is a bijection $b : I \to J$. Define $u : H \to H'$ and $v : H' \to H$ by linear
extension of $u\upsilon_i = \upsilon'_{b(i)}$ and $v\upsilon'_j = \upsilon_{b^{-1}(j)}$, that is,

$u\psi = \sum_{i \in I} (\upsilon_i, \psi) \upsilon'_{b(i)} = \sum_{j \in J} (\upsilon_{b^{-1}(j)}, \psi) \upsilon'_j;$ (B.219)

$v\psi' = \sum_{j \in J} (\upsilon'_j, \psi')' \upsilon_{b^{-1}(j)} = \sum_{i \in I} (\upsilon'_{b(i)}, \psi')' \upsilon_i,$ (B.220)

where in each line the first equality sign is the definition of the map, whilst the
second is a useful rewriting. These maps are well defined by Lemma B.59, e.g.,

$\sum_{j \in J} |(\upsilon_{b^{-1}(j)}, \psi)|^2 = \sum_{i \in I} |(\upsilon_i, \psi)|^2 = \|\psi\|^2 < \infty,$ (B.221)

so that the sums in (B.219) converge, and likewise for (B.220). Furthermore,

$(u\psi, u\varphi)' = \sum_{i_1, i_2} (\psi, \upsilon_{i_1})(\upsilon_{i_2}, \varphi)(\upsilon'_{b(i_1)}, \upsilon'_{b(i_2)})' = \sum_i (\psi, \upsilon_i)(\upsilon_i, \varphi) = (\psi, \varphi),$

where we used (B.207) for the primed basis, and (B.215). Similar computations
establish $(v\psi', v\varphi') = (\psi', \varphi')'$, so that (in view of their obvious surjectivity) $u$
and $v$ are both unitary, as well as $uv = 1_{H'}$ and $vu = 1_H$. Thus $H \cong H'$.
Conversely, if $H$ (with basis $\{\upsilon_i\}_{i \in I}$) and $H'$ are isomorphic, so that there is a
unitary $u : H \to H'$, then $\{u\upsilon_i\}_{i \in I}$ is a basis of $H'$, hence one may even take $J = I$.
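
The Gram–Schmidt procedure invoked in part 1 of the proof can be sketched for finitely many vectors in $\mathbb{C}^n$. The function name `gram_schmidt` and the sample input are illustrative; the sketch assumes the input vectors are linearly independent and performs the normalization at each step.

```python
import math

def inner(x, y):
    """Inner product on C^n, anti-linear in the first and linear in the second entry."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors in C^n."""
    basis = []
    for psi in vectors:
        # Subtract the components of psi along the vectors found so far.
        w = list(psi)
        for v in basis:
            c = inner(v, psi)
            w = [a - c * b for a, b in zip(w, v)]
        length = norm(w)  # nonzero because of linear independence
        basis.append([a / length for a in w])
    return basis

basis = gram_schmidt([[1, 1, 0], [1, 0, 1j], [0, 1, 1]])
```

The resulting list satisfies the orthonormality condition (B.207) up to rounding errors; in infinite dimension the same recursion is applied to a countable sequence.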


Corollary B.64. If $\{\upsilon_i\}_{i \in I}$ is a basis of $H$, then $H \cong \ell^2(I)$.


Proof. Define $u : H \to \ell^2(I)$ by linear extension of $u\upsilon_i = \delta_i$, where $i \in I$.


Corollary B.65. A Hilbert space is (topologically) separable iff it either has a 
countable basis, or is finite-dimensional. 


Proof. One direction of the proof is the Gram–Schmidt procedure (applied to the given
countable dense set, which yields a basis). Conversely, if $\{\upsilon_i\}$ is a countable (or finite)
basis of $H$, then the complex rational linear span of this set, i.e., the set of all finite
linear combinations $\sum_i c_i \upsilon_i$ with $c_i \in \mathbb{Q} + i\mathbb{Q}$, is countable as well as dense in $H$.


In particular, any finite-dimensional Hilbert space is isomorphic to $\mathbb{C}^n$ with standard
inner product, and any separable Hilbert space is isomorphic to $\ell^2(\mathbb{N})$; when speaking
of a separable Hilbert space we actually tend to think of the infinite-dimensional
case. Although at first sight separability appears to be a rather restrictive condition,
in fact the non-separable case only appears in some weird proofs in the theory of
operator algebras (as well as in the theory of almost periodic functions in the
sense of H. Bohr). Indeed, every Hilbert space naturally occurring in applications to
mathematical physics (or to partial differential equations) is separable.


B.13 Operators on infinite-dimensional Hilbert space


The fact that all (infinite-dimensional) separable Hilbert spaces are isomorphic suggests
that the riches of the theory are not to be found in the spaces themselves, but
in the operators that act on them (whose explicit form typically depends on some
concrete realization of H, like (?(N), or L?(IR®), etc.). The simplest operators are 
functionals, i.e., linear maps $f : H \to \mathbb{C}$, and the main new feature compared to the
finite-dimensional case is that $f$ is no longer necessarily bounded, see §B.9. The
nature of bounded linear functionals, i.e., elements of the dual $H^*$, is totally settled
by the Riesz–Fréchet Theorem (which we already know; cf. Proposition A.5 and
nos. 6 and 7 in Table B.1 in §B.9), showing that little is gained by looking at them.


Theorem B.66. Let $H$ be a Hilbert space. The map $\psi \mapsto f_\psi$ from $H$ to $H^*$, where

$f_\psi(\varphi) = (\psi, \varphi),$ (B.222)

is an isometric anti-linear isomorphism $H \to H^*$.


Proof. For convenience we rewrite (B.124) for the case at hand as

$\|f\| = \sup\{ |f(\psi)|, \psi \in H, \|\psi\|_H \leq 1 \}.$ (B.223)

Since $|f_\psi(\varphi)| = |(\psi, \varphi)| \leq \|\psi\| \|\varphi\|$ by Cauchy–Schwarz, it follows that $f_\psi \in H^*$
for any $\psi \in H$, with $\|f_\psi\| \leq \|\psi\|$. We may sharpen this to equality, i.e.,

$\|f_\psi\| = \|\psi\|,$ (B.224)

by choosing $f = f_\psi$ in (B.223) and evaluating at the unit vector $\psi/\|\psi\|$ (for $\psi \neq 0$).
Hence $\psi \mapsto f_\psi$ is isometric and therefore also injective. To prove surjectivity, we
find a vector $\psi$ for which some given functional $f$ equals $f_\psi$. Assume $f \neq 0$
(otherwise, $\psi = 0$ does the job). Then $\ker(f)^\perp \neq \{0\}$: namely, $\ker(f)$ is closed
by continuity of $f$ and is linear by linearity of $f$, whence $\ker(f)^{\perp\perp} = \ker(f)$ by
Proposition B.57.3, so that (arguing by contradiction) $\ker(f)^\perp = \{0\}$ would imply
$\ker(f)^{\perp\perp} = H$ and hence $\ker(f) = H$, or $f = 0$.

The remainder of the proof is the same as for Proposition A.5.
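
In finite dimension the vector representing a functional can be written down explicitly: since $f(\varphi) = (\psi, \varphi)$ is linear in $\varphi$, the components of $\psi$ in the standard basis $(e_i)$ are $\psi_i = \overline{f(e_i)}$. The sketch below assumes $H = \mathbb{C}^3$; the helper `riesz_vector` and the sample functional `f` are illustrative and not part of the text.

```python
def inner(x, y):
    """Inner product on C^n, anti-linear in the first and linear in the second entry."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def riesz_vector(f, dim):
    """Vector psi in C^dim with f(phi) = (psi, phi) for a linear functional f."""
    psi = []
    for i in range(dim):
        e_i = [1 if k == i else 0 for k in range(dim)]
        # The i-th component of psi is the complex conjugate of f(e_i).
        psi.append(complex(f(e_i)).conjugate())
    return psi

def f(phi):
    """A sample linear functional on C^3 (chosen for illustration)."""
    return (2 - 1j) * phi[0] + 3j * phi[1] - phi[2]

psi = riesz_vector(f, 3)
phi = [1j, 2, 1 - 1j]
difference = abs(f(phi) - inner(psi, phi))
```

The conjugation in `riesz_vector` is the finite-dimensional trace of the anti-linearity of $\psi \mapsto f_\psi$.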


This allows one to make the weak topology on $H$ (or, equivalently, the weak*
topology on $H^*$) explicit (cf. §B.9): we have $\psi_n \to \psi$ weakly iff $(\varphi, \psi_n - \psi) \to 0$
for each $\varphi \in H$ (and similarly for nets). From the general theory, or directly from
Cauchy–Schwarz, it is immediate that (at least for infinite-dimensional $H$) the weak
topology on $H$ is indeed weaker than the strong one (that is, strong convergence implies
weak convergence), but not the other way round. A simple example is provided
by any ordered countable basis $(\upsilon_n)_{n \in \mathbb{N}}$ of a separable Hilbert space, where $\upsilon_n \to 0$
weakly but not strongly as $n \to \infty$ (more generally, for any infinite-dimensional
Hilbert space and any basis $\{\upsilon_i\}$ we have $\upsilon_i \to 0$ weakly but not strongly in the
sense of convergence of nets). Nonetheless, as a corollary of Proposition B.46:


Corollary B.67. The functional $f_\psi$ defined by (B.222) is weakly continuous.
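
The difference between weak and strong convergence of basis vectors can be made concrete in $\ell^2(\mathbb{N})$. The sketch below checks one sample vector, $\varphi(k) = 1/k$, against the standard basis vectors $\delta_n$; the names `phi`, `delta` and `inner_truncated`, and the cutoff, are assumptions made for illustration.

```python
def phi(k):
    """A fixed vector in l^2(N): phi(k) = 1/k is square-summable."""
    return 1.0 / k

def delta(n):
    """The n-th standard basis vector of l^2(N), as a function of k."""
    return lambda k: 1.0 if k == n else 0.0

def inner_truncated(x, y, cutoff):
    """Inner product of real sequences, summed over k = 1, ..., cutoff."""
    return sum(x(k) * y(k) for k in range(1, cutoff + 1))

cutoff = 1000
indices = (1, 10, 100, 1000)
# The overlaps (phi, delta_n) equal 1/n and tend to zero ...
overlaps = [inner_truncated(phi, delta(n), cutoff) for n in indices]
# ... whereas each delta_n keeps norm one.
norms = [inner_truncated(delta(n), delta(n), cutoff) ** 0.5 for n in indices]
```

The overlaps shrink like $1/n$ while the norms stay equal to one, so the sequence cannot converge to zero in norm.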




We now move from functionals as special operators from $H$ to $\mathbb{C}$ to operators in
the usual sense, i.e., linear maps from $H$ to itself. Once again, the main new feature
compared to the finite-dimensional case is that a linear map $a : H \to H$ is no longer
necessarily bounded, where (cf. Definition B.32) we recall that $a$ is bounded if it
satisfies one (and hence both) of the following equivalent conditions:

$\|a\psi\| \leq C\|\psi\| \quad (\psi \in H);$ (B.225)
$\sup\{ \|a\psi\|, \psi \in H, \|\psi\| \leq 1 \} < \infty.$ (B.226)

In that case, the (finite) supremum is called the norm $\|a\|$ of $a$, exactly as in (A.18).
Using Theorem B.66 and (B.130), we therefore have

$\|a\| = \sup\{ \|a\psi\|, \psi \in H, \|\psi\| = 1 \}$ (B.227)
$\quad = \sup\{ |(\varphi, a\psi)|, \psi, \varphi \in H, \|\psi\| = \|\varphi\| = 1 \},$ (B.228)

and we have the inequalities (A.20) and (A.21), as in the finite-dimensional case.

It is clear from (A.20) and (B.225) that bounded operators $a$ are continuous,
in that if $\psi_n \to \psi$, then $a\psi_n \to a\psi$. On the other hand, unbounded operators are
discontinuous in this sense: for each $n \in \mathbb{N}$ there is $\psi_n \in H$ with $\|\psi_n\| = 1$ and
$\|a\psi_n\| \geq n$. The sequence $(\psi'_n = \psi_n/n)$ then converges to zero, but since $\|a\psi'_n\| \geq 1$,
the sequence $(a\psi'_n)$ does not converge to $a \cdot 0 = 0$. Thus on infinite-dimensional
Hilbert spaces a sharp distinction emerges between bounded and unbounded operators.
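
The discontinuity argument can be followed on a concrete unbounded operator. As an assumption of this sketch, $a$ acts on finitely supported sequences in $\ell^2(\mathbb{N})$ by $a\delta_n = n\delta_n$, so it is defined only on a dense subspace rather than on all of $H$; the names `apply_a` and `scaled_basis_vector` are illustrative.

```python
import math

def apply_a(psi):
    """Apply a (multiplication by the index n) to a sequence stored as {index: value}."""
    return {n: n * c for n, c in psi.items()}

def norm(psi):
    """l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(c) ** 2 for c in psi.values()))

def scaled_basis_vector(n):
    """The vector delta_n / n, which has norm 1/n."""
    return {n: 1.0 / n}

indices = (1, 10, 100, 1000)
# The inputs tend to zero in norm ...
input_norms = [norm(scaled_basis_vector(n)) for n in indices]
# ... but their images under a all have norm one.
output_norms = [norm(apply_a(scaled_basis_vector(n))) for n in indices]
```

The inputs shrink like $1/n$ while the outputs keep norm one, which is the failure of continuity at zero described above.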

Among the former, we will distinguish between compact operators and the rest, 
whilst among the latter, one has the closed operators (i.e., those with a closed 
graph), which are still reasonably well-behaved, and the (non-closed) rest. Yet cut- 
ting through the bounded-unbounded divide is the notion of self-adjointness. For 
any linear (not necessarily bounded) map $a : H \to H$, we say that $a$ is self-adjoint if

$(a\varphi, \psi) = (\varphi, a\psi) \quad (\psi, \varphi \in H).$ (B.229)

The remarkable Hellinger–Toeplitz Theorem then states that such maps are bounded:

Theorem B.68. If a linear map $a : H \to H$ satisfies (B.229), then it is bounded.


Proof. The proof is based on the Closed Graph Theorem B.37. If the sequence
$(\psi_n, a\psi_n)$ in $G(a) \subset H \oplus H$ converges, say to $(\psi, \varphi) \in H \oplus H$, then $\psi_n \to \psi$ and
$a\psi_n \to \varphi$. Using (B.229) and continuity of the inner product, for $\chi \in H$ we have

$(\chi, \varphi) = \lim_n (\chi, a\psi_n) = \lim_n (a\chi, \psi_n) = (a\chi, \psi) = (\chi, a\psi).$

For $\chi = \varphi - a\psi$, this yields $\varphi = a\psi$, and hence $(\psi, \varphi) \in G(a)$. This means that
$G(a)$ is closed, upon which the Closed Graph Theorem states that $a$ is bounded.


More generally, if $V$ and $W$ are Banach spaces, with dual spaces $V^*$ and $W^*$, respectively,
and two linear (but not a priori bounded) maps $a : V \to W$ and $b : W^* \to V^*$
satisfy $\varphi(av) = (b\varphi)(v)$ for each $v \in V$ and $\varphi \in W^*$, then $a$ and $b$ are bounded, with
$b = a^*$, as defined in (B.125). The proof is similar.

This generalization of Theorem B.68 also places the familiar adjoint a* from 
Hilbert space in broader perspective: making the identification $f_\psi \leftrightarrow \psi$ of $H^*$ with $H$
described by the Riesz–Fréchet Theorem B.66, the Banach space definition (B.125)
of the adjoint a* : H* — H* of a bounded linear map a: H — H reproduces the 
definition (A.15) of the Hilbert space adjoint a* : H — H. Thus we also infer that 
(B.128) is valid for arbitrary Hilbert spaces. Note that in the Hilbert space case, 
boundedness of a* may be proved more simply, as follows. 


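Proposition B.69. Let $a \in B(H)$ and let $a^* : H \to H$ be its adjoint, that is,

$(a^*\psi, \varphi) = (\psi, a\varphi) \quad (\psi, \varphi \in H).$ (B.230)

Then $a^*$ is bounded, with $\|a^*\| = \|a\|$.


Proof. Eq. (B.230) gives $|(a^*\psi, \varphi)| \leq \|a\| \|\psi\| \|\varphi\|$. Taking $\varphi = a^*\psi$ yields $\|a^*\psi\| \leq
\|a\| \|\psi\|$, and hence $\|a^*\| \leq \|a\|$. Replacing $a$ by $a^*$ and using $a^{**} = a$ gives the
reverse inequality, and hence the last claim.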


Since unbounded self-adjoint operators a : H — H do not exist, von Neumann 
defined such operators on some (proper) linear subspace D(a) C H (always assumed 
to be dense in H), called the domain of a. This affects the definition of the adjoint: 


Definition B.70. 1. The adjoint $a^*$ of an operator $a : D(a) \to H$ has domain $D(a^*) \subset H$
consisting of all $\psi \in H$ for which the functional $f^a_\psi : D(a) \to \mathbb{C}$, defined by

$f^a_\psi(\varphi) = (\psi, a\varphi) \quad (\varphi \in D(a)),$ (B.231)

is bounded, i.e., there is $C \geq 0$ such that $|f^a_\psi(\varphi)| \leq C\|\varphi\|$ for all $\varphi \in D(a)$.
2. For $\psi \in D(a^*)$, the functional $f^a_\psi$ has a unique bounded extension $f^a_\psi : H \to \mathbb{C}$,
so by Theorem B.66 there is a unique vector $\psi' \in H$ such that $f^a_\psi(\varphi) = (\psi', \varphi)$.
3. The adjoint $a^* : D(a^*) \to H$, then, is defined by $a^*\psi = \psi'$, or, equivalently, by

$(a^*\psi, \varphi) = (\psi, a\varphi), \quad \psi \in D(a^*),\ \varphi \in D(a).$ (B.232)

Note that, on our assumption that $D(a)$ be dense in $H$, i.e., $D(a)^- = H$, eq.
(B.232) indeed uniquely specifies $a^*\psi$ because of Proposition B.57.4.
4. An operator $a : D(a) \to H$ is called self-adjoint when $D(a^*) = D(a)$ and $a^* = a$.


If $D(a) = H$, and $a$ is bounded, then also $D(a^*) = H$, since $|f^a_\psi(\varphi)| \leq \|a\| \|\psi\| \|\varphi\|$,
so that $f^a_\psi$ is bounded for any $\psi \in H$. Accordingly, for $a \in B(H)$, Definition B.70
reduces to the usual definition (A.15). Furthermore, even if $D(a)$ is merely dense
in $H$, if $a : D(a) \to H$ is bounded in the sense of (B.225) - (B.226), but now with
$\psi \in D(a)$ instead of $\psi \in H$, then $a$ has a unique extension to a bounded operator
$a : H \to H$, whose adjoint $a^*$ may be either defined through Definition B.70 as the
adjoint of $a : D(a) \to H$, or, equivalently, as the adjoint of the extension $a : H \to H$.

Here, as well as in Definition B.70.2, a general Banach space principle is at work:

Proposition B.71. Let $V$ and $W$ be Banach spaces, and let $V'$ be a dense linear subspace
of $V$. Any bounded linear map $a' : V' \to W$ (in the sense of Definition B.32) has a
unique bounded linear extension $a : V \to W$, with $\|a\| = \|a'\|$.


Proof. For $v \in V$ there is a sequence $(v_n)$ in $V'$ with $v_n \to v$. Since $a' : V' \to W$ is
bounded and $(v_n)$ is convergent in $V$ and hence Cauchy, also the sequence
$(a'v_n)$ in $W$ is Cauchy. Since $W$ is assumed complete, we may define $av = \lim_n a'v_n$.
This limit is easily seen to be independent of the approximating sequence to $v$, and
the ensuing map $a : V \to W$ is clearly linear. Furthermore, since by (B.5) we have
$\|v\| = \lim_n \|v_n\|$, if we assume $\|v\| = 1$ we can take $v_n$ to have unit norm also.
Once again from (B.5), we also have $\|av\| = \lim_n \|a'v_n\| \leq \sup_n \|a'v_n\|$, whence
$\|a\| \leq \|a'\|$. But for $v \in V'$, taking $v_n = v$ we have $a'v = av$, and hence the bound
$\|a'v\| \leq \|a\| \|v\|$, from which $\|a'\| \leq \|a\|$, so that finally $\|a\| = \|a'\|$.


To complete these basic definitions, we say that an (unbounded) operator $a : D(a) \to H$
is closed if its graph $G(a) = \{(\psi, a\psi),\ \psi \in D(a)\}$ is a closed subspace
of $H \oplus H$, cf. (B.108). Note that in the Hilbert space case it is more appropriate to
replace the norm (B.107) on $H \oplus H$ by the equivalent norm

$$\|(v, w)\| = \sqrt{\|v\|^2 + \|w\|^2}, \tag{B.233}$$

since this alternative norm comes from the canonical inner product on $H \oplus H$, viz.

$$\langle (v, w), (v', w') \rangle_{H \oplus H} = \langle v, v' \rangle_H + \langle w, w' \rangle_H. \tag{B.234}$$


We now prove an important property of self-adjoint operators: 


Proposition B.72. The adjoint $a^*$ of any operator $a : D(a) \to H$ is closed. In particular,
self-adjoint operators are closed.

Proof. The proof can be elegantly given in terms of the graph $G(a)$. Defining

$$u : H \oplus H \to H \oplus H; \tag{B.235}$$
$$u(\psi_1, \psi_2) = (-\psi_2, \psi_1), \tag{B.236}$$

it is easy to verify that $u$ is a unitary operator, and that

$$G(a^*) = u(G(a)^\perp) = (uG(a))^\perp. \tag{B.237}$$

Hence $G(a^*)$ is closed by Proposition B.57.1, and the claim follows.


In the context of spectral theory, we will see later what the real importance of
self-adjointness (and, more generally, closedness) is. It is time for some examples.


Proposition B.73. Let $H = \ell^2(X)$, with $X$ countable for simplicity, and for
$f \in \ell^\infty(X)$ define the multiplication operator $m_f : H \to H$ by

$$m_f \psi = f\psi, \tag{B.238}$$

i.e., $m_f\psi(x) = f(x)\psi(x)$. Then $m_f$ is bounded, with norm, cf. (A.107),

$$\|m_f\| = \|f\|_\infty. \tag{B.239}$$

More generally, let $H = L^2(X)$ for some $\sigma$-finite Borel space $(X, \Sigma, \mu)$, and for
$f \in L^\infty(X)$, define $m_f$ in the same way. Then $m_f$ is again bounded, with norm

$$\|m_f\| = \|f\|_\infty^{\mathrm{ess}}. \tag{B.240}$$

Finally, let $f : X \to \mathbb{R}$ be measurable (but not necessarily essentially bounded). Then

$$D(m_f) = \{\psi \in L^2(X) \mid f\psi \in L^2(X)\} \tag{B.241}$$

is dense in $L^2(X)$, and if $f^* = f$, the operator $m_f : D(m_f) \to L^2(X)$ is self-adjoint.


Proof. On $\ell^2(X)$ we have $\|f\psi\|_2 \leq \|f\|_\infty \|\psi\|_2$, and hence $\|m_f\| \leq \|f\|_\infty$. Assume
$f \neq 0$. Then $\|f\|_\infty > 0$, and for any $0 < t < \|f\|_\infty$ there is $x_t \in X$ such that $|f(x_t)| > t$,
so that $\psi_t = 1_{\{x_t\}} \in \ell^2(X)$ satisfies $\|m_f\psi_t\|_2 = |f(x_t)| > t$, whence $\|m_f\| > t$. This
holds for all $0 < t < \|f\|_\infty$, hence $\|m_f\| \geq \|f\|_\infty$, which yields (B.239).
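The two halves of this argument can be checked on a finite set. The following is only a sketch: the function values `f`, the test vector `psi`, and all helper names are hypothetical, and on a finite set the supremum of $|f|$ is attained, so a single indicator vector suffices where the proof needs the whole family $\psi_t$.

```python
import math

def apply_mf(f, psi):
    """Multiplication operator on a finite set X: (m_f psi)(x) = f(x) * psi(x)."""
    return [fx * px for fx, px in zip(f, psi)]

def norm2(psi):
    """The l^2 norm of a finite sequence."""
    return math.sqrt(sum(abs(p) ** 2 for p in psi))

def indicator(size, x):
    """The unit vector 1_{x} supported at the single point x."""
    return [1.0 if i == x else 0.0 for i in range(size)]

# Hypothetical sample data: f on X = {0, ..., 5} and a test vector psi.
f = [0.3, -1.2, 0.7, 2.5, -0.1, 1.9]
psi = [1.0, -2.0, 0.5, 0.25, 3.0, -1.0]
sup_f = max(abs(v) for v in f)

# Upper bound: ||f psi||_2 <= ||f||_inf * ||psi||_2.
upper_bound_holds = norm2(apply_mf(f, psi)) <= sup_f * norm2(psi) + 1e-12

# Lower bound: ||m_f 1_{x}||_2 = |f(x)|, so a point where |f| is large
# forces ||m_f|| to be at least that large.
x_max = max(range(len(f)), key=lambda i: abs(f[i]))
attained = norm2(apply_mf(f, indicator(len(f), x_max)))
```

Here `attained` coincides with `sup_f`, which is the finite-dimensional shadow of (B.239).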

To prove (B.240), again assume $\|f\|_\infty^{\mathrm{ess}} > 0$ and $0 < t < \|f\|_\infty^{\mathrm{ess}}$. Then the set
$X_t = \{x \in X,\ |f(x)| > t\}$ is measurable, with $\mu(X_t) > 0$. Since $(X, \Sigma, \mu)$ is $\sigma$-finite,
there is $X_t' \subset X_t$ with $0 < \mu(X_t') < \infty$. Take $\psi = 1_{X_t'}$, so that $\|f\psi\|_2 > t\|\psi\|_2$, etc.

To prove the density of $D(m_f)$, for $n \in \mathbb{N}$ define $X_n = \{x \in X \mid |f(x)| < n\}$, so that
$X = \cup_n X_n$. For each $\psi \in L^2(X)$ we then have $1_{X_n}\psi \in D(m_f)$. Writing $\varphi_n = 1_{X_n}\psi$,
we have $\langle \psi, \varphi_n \rangle = \int_{X_n} d\mu\, |\psi|^2$, hence $\langle \psi, \varphi_n \rangle = 0$ iff $\psi = 0$ $\mu$-a.e. on $X_n$. This is true
for each $n \in \mathbb{N}$ iff $\psi = 0$, so the required density follows from Proposition B.57.4.


In the last claim (where $f^*(x) = \overline{f(x)}$), the domain $D(m_f^*)$ consists of all $\psi \in L^2(X)$
for which the map $\varphi \mapsto \int_X d\mu\, \overline{\psi} f \varphi$ is bounded; by Theorem B.66 this is the
case iff $f\psi \in L^2(X)$, so that $D(m_f^*) = D(m_f)$. Moreover, (B.232) obviously holds
for $a^* = m_f$ (if $f$ takes complex values, then $m_f^* = m_{\overline{f}}$, still on $D(m_f^*) = D(m_f)$).


For quantum mechanics, a key example is $H = L^2(\mathbb{R})$ with $f(x) = x$, i.e., the position
operator. It then follows from Proposition B.73 that $x$ is self-adjoint on the domain

$$D(m_x) = \left\{\psi \in L^2(\mathbb{R}) \;\Big|\; \int_{\mathbb{R}} dx\, x^2 |\psi(x)|^2 < \infty \right\}. \tag{B.242}$$
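To see that this domain is genuinely smaller than $L^2(\mathbb{R})$, one can look at a function that is square-integrable while $x\psi$ is not. The sketch below uses the sample function $\psi(x) = (1+x^2)^{-1/2}$, which is a choice made here for illustration and not an example from the text; the helper names and the midpoint-rule quadrature are likewise assumptions.

```python
import math

def integrate(g, a, b, steps=100000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def psi_sq(x):
    """|psi(x)|^2 for the sample function psi(x) = (1 + x^2)^(-1/2)."""
    return 1.0 / (1.0 + x * x)

def norm_sq(cutoff):
    """Integral of |psi|^2 over [-cutoff, cutoff]."""
    return integrate(psi_sq, -cutoff, cutoff)

def x_moment(cutoff):
    """Integral of x^2 |psi|^2 over [-cutoff, cutoff]."""
    return integrate(lambda x: x * x * psi_sq(x), -cutoff, cutoff)

for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff, norm_sq(cutoff), x_moment(cutoff))
```

As the cutoff grows, the first integral approaches $\pi$ while the second grows roughly linearly, so this $\psi$ lies in $L^2(\mathbb{R})$ but outside the domain (B.242).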


See also §5.11. It happens often that a given operator on some domain is not closed
as it stands, but can be made so by slightly enlarging its domain. Thus an operator
$a : D(a) \to H$ is closable if the closure of the graph $G(a)$ in $H \oplus H$ is the graph of
a closed operator $a^-$, called the closure of $a$, i.e., $G(a)^- = G(a^-)$. The following
easy lemma is very useful in proving closability (the proof is a definition chase).


Lemma B.74. Each of the following conditions is equivalent to closability of $a$:

1. If $(\psi_n)$ is a sequence in $D(a)$ such that $\psi_n \to 0$, and if its image $(a\psi_n)$ converges,
too, then $a\psi_n \to 0$.
2. The domain $D(a^*)$ of the adjoint $a^*$ (see Definition B.70) is dense in $H$.

The domain $D(a^-)$ of the closure $a^-$ of a closable operator $a$ consists of all $\psi \in H$
for which there exists a sequence $(\psi_n)$ in $D(a)$ such that $\psi_n \to \psi$ and $a\psi_n$ converges,
so that $a^-\psi = \lim_n a\psi_n$. Finally, if $a$ is closable, then $a^- = a^{**}$ and $(a^-)^* = a^*$.


An equality $a = b$ between unbounded operators always stands for $D(a) = D(b)$
and $a\psi = b\psi$ on this common domain. Furthermore, $a \subset b$ means $D(a) \subset D(b)$ and $b = a$ on $D(a)$.


Definition B.75. Let $a : D(a) \to H$ (where $D(a)$ is dense) be an operator.

• If $a \subset a^*$, i.e., if $\langle a\varphi, \psi \rangle = \langle \varphi, a\psi \rangle$ for all $\varphi, \psi \in D(a)$, then $a$ is called symmetric.
• If $a$ is closable and $a^- = a^*$ (in which case the closure $a^-$ of $a$ is self-adjoint),
then $a$ is called essentially self-adjoint.


It follows from Lemma B.74 that a symmetric operator is closable (because $D(a^*)$,
containing $D(a)$, is dense). For a symmetric operator one has $a \subset a^- = a^{**} \subset a^*$,
with equality at the first position when $a$ is closed, and equality at the second position
when $a$ is essentially self-adjoint; when both equalities hold, $a$ is self-adjoint.
Conversely, an essentially self-adjoint operator is symmetric. A symmetric operator
may or may not be essentially self-adjoint; we will not discuss this problem here.


As in the finite-dimensional case, the notion of the adjoint allows one to define a
projection as an operator $e : H \to H$ that satisfies $e^2 = e^* = e$. However, Proposition
A.8 should be slightly adapted in order to cover the infinite-dimensional case:

Proposition B.76. There is a bijective correspondence $e \leftrightarrow L$ between:

• projections $e$ on $H$;
• closed linear subspaces $L$ of $H$,

still given by (A.27) - (A.28), where now $\{\upsilon_i\}_{i \in I}$ is a basis of $L$, and the latter sum
must be applied to fixed $\psi \in H$ according to Definition B.6 with $V = H$, i.e.,

$$e\psi = \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon_i, \quad \psi \in H. \tag{B.243}$$

Alternatively, without invoking the concept of a basis, one may use the decomposition
(B.203) as proved via Lemma B.58, to define $e$ directly by $e\psi = \psi^\parallel$.


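Proof. The linear subspace $L = eH$ is closed, since $e$ is bounded by Theorem B.68.
Conversely, note that since $L$ is closed, it is a Hilbert space, so that it has a basis
by Theorem B.63. The sum in (B.243) then converges by Lemma B.59, and since

$$\langle \varphi, e\psi \rangle = \sum_{i \in I} \langle \upsilon_i, \psi \rangle \langle \varphi, \upsilon_i \rangle = \sum_{i \in I} \overline{\langle \upsilon_i, \varphi \rangle}\, \overline{\langle \psi, \upsilon_i \rangle} = \overline{\langle \psi, e\varphi \rangle} = \langle e\varphi, \psi \rangle;$$
$$e^2\psi = \sum_{i \in I} \langle \upsilon_i, \psi \rangle e\upsilon_i = \sum_{i,j \in I} \langle \upsilon_i, \psi \rangle \langle \upsilon_j, \upsilon_i \rangle \upsilon_j = \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon_i = e\psi,$$

the operator $e$ is a projection (in the second computation we used boundedness of $e$
to pull it through the sum). Next, (B.243) is independent of the choice of a basis of
$L$, since if $\{\upsilon'_{i'}\}_{i' \in I'}$ is another basis of $L$, for arbitrary $\varphi \in L$ we may compute: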
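$$\Big\langle \varphi, \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon_i - \sum_{i' \in I'} \langle \upsilon'_{i'}, \psi \rangle \upsilon'_{i'} \Big\rangle = \sum_{i \in I} \langle \varphi, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle - \sum_{i' \in I'} \langle \varphi, \upsilon'_{i'} \rangle \langle \upsilon'_{i'}, \psi \rangle = \langle \varphi, \psi \rangle - \langle \varphi, \psi \rangle = 0, \tag{B.244}$$

where we twice used (B.215), applied to the Hilbert space $L$. Hence

$$\sum_{i \in I} \langle \varphi, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle = \sum_{i' \in I'} \langle \varphi, \upsilon'_{i'} \rangle \langle \upsilon'_{i'}, \psi \rangle. \tag{B.245}$$

Finally, we prove bijectivity of the correspondence $L \leftrightarrow e$:

• Given $L$, by Lemma B.59 (applied to the Hilbert space $L$), $e\psi \in L$ for any $\psi \in H$,
whereas if $\psi \in L$, then $e\psi = \psi$ by Theorem B.61 and (B.210). Hence $eH = L$.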

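• Given $e$, we first note that for any $\chi \in eH \equiv L$, by definition we have $\chi = e\psi$
for some $\psi \in H$, whence $e\chi = e^2\psi = e\psi = \chi$. Now pick a basis $\{\upsilon_i\}$ of the
Hilbert space $eH$, so that in particular $e\upsilon_i = \upsilon_i$. For arbitrary $\varphi, \psi \in H$, writing
$\varphi = e\varphi + (1-e)\varphi \equiv \varphi^\parallel + \varphi^\perp$, so that $\varphi^\parallel \in L$ and hence $e\varphi^\parallel = \varphi^\parallel$, we compute

$$\langle \varphi^\parallel, e\psi \rangle - \sum_i \langle \varphi^\parallel, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle = \langle \varphi^\parallel, \psi \rangle - \langle \varphi^\parallel, \psi \rangle = 0;$$
$$\langle \varphi^\perp, e\psi \rangle - \sum_i \langle \varphi^\perp, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle = \langle \varphi, (1-e)e\psi \rangle - \sum_i \langle \varphi, (1-e)\upsilon_i \rangle \langle \upsilon_i, \psi \rangle = 0,$$

where in the first line we used (B.215), applied to the Hilbert space $H$.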
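In finite dimension the two defining properties of the operator (B.243) can be verified directly. The sketch below works in $\mathbb{R}^3$ with a hypothetical orthonormal basis of a two-dimensional subspace; the vectors and helper names are illustrative assumptions, and real scalars are used so that no complex conjugation is needed.

```python
import math

def inner(u, v):
    """Inner product of real vectors."""
    return sum(x * y for x, y in zip(u, v))

def project(basis, psi):
    """e psi = sum_i <v_i, psi> v_i for an orthonormal family `basis`, cf. (B.243)."""
    result = [0.0] * len(psi)
    for v in basis:
        c = inner(v, psi)
        result = [r + c * x for r, x in zip(result, v)]
    return result

# Hypothetical orthonormal basis of a two-dimensional subspace L of R^3.
s = 1.0 / math.sqrt(2.0)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]
psi = [3.0, -1.0, 2.0]
phi = [1.0, 2.0, 3.0]

e_psi = project(basis, psi)        # the component of psi in L
e_e_psi = project(basis, e_psi)    # applying e twice changes nothing
```

Comparing `inner(phi, e_psi)` with `inner(project(basis, phi), psi)` checks the symmetry $e^* = e$, and comparing `e_e_psi` with `e_psi` checks $e^2 = e$.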


It is easy to see why the sum (B.243) cannot, in general, converge in norm without
the $\psi$, i.e., in the original (finite-dimensional) form (A.28); it suffices to take $e = 1$
(for $H = \ell^2(\mathbb{N})$, for simplicity). Writing $e_n = \sum_{i=1}^n |\upsilon_i\rangle\langle\upsilon_i|$, where, for example,
$\upsilon_i = \delta_i$, for any unit vector $\psi$ and $m > n$, from (A.18) we have

$$\|e_m - e_n\|^2 \geq \|(e_m - e_n)\psi\|^2 = \sum_{i=n+1}^m |\langle \upsilon_i, \psi \rangle|^2. \tag{B.246}$$

Taking $\psi = \upsilon_j$ for any $n+1 \leq j \leq m$ shows that $\|e_m - e_n\|^2 \geq 1$ for all $m > n$,
so that $(e_n)$ cannot be a Cauchy sequence in $B(H)$. This argument applies to any
infinite-dimensional subspace $L$. Therefore, if $H$ is infinite-dimensional we should
work with at least two notions of convergence within the Banach space $B(H)$ (cf.
Theorem B.33), which for simplicity we state for sequences (more generally, one
should define the corresponding topologies in terms of convergence of nets):


• $a_n \to a$ in the norm topology (or uniformly) in $B(H)$ iff $\|a_n - a\| \to 0$.
• $a_n \to a$ in the strong topology in $B(H)$ iff $\|(a_n - a)\psi\| \to 0$, for each $\psi \in H$.


The strong topology on $B(H)$ is also called the strong operator topology, in order
to distinguish it from the strong topology on $H$ itself (which, confusingly, is another
name for the norm topology) in terms of which it is defined. Similarly, the weak
topology on $H$ (cf. §B.12) defines a weak operator topology on $B(H)$, as follows:

• $a_n \to a$ weakly on $B(H)$ iff $\langle \varphi, (a_n - a)\psi \rangle \to 0$, for each $\varphi, \psi \in H$.
In decreasing strength we have 'norm - strong - weak', and we show that this trio is
distinguishable on $H = \ell^2(\mathbb{N})$ (and hence on any infinite-dimensional Hilbert space):

• Let $a_n\psi(x) = 0$ for $x = 1, \ldots, n$ whilst $a_n\psi(x) = \psi(x)$ for $x > n$. In other words,
if $\psi = (\psi_1, \psi_2, \ldots)$, then $a_n\psi = (0, \ldots, 0, \psi_{n+1}, \psi_{n+2}, \ldots)$ with $n$ zeros. Hence

$$\|a_n\psi\|^2 = \sum_{x=n+1}^{\infty} |\psi(x)|^2,$$

so that $\|a_n\psi\| \to 0$ as $n \to \infty$ in order for $\psi$ to be in $\ell^2(\mathbb{N})$. Thus $a_n \to 0$ strongly
(and hence also weakly). If $(a_n)$ were to have a norm limit, it therefore would
have to be zero, too, but since $\|a_n\| \geq \|a_n\psi\|$ for any unit vector $\psi$, taking e.g.
$\psi = \delta_{n+1}$, we have $\|a_n\| \geq 1$ for any $n$ and hence $(a_n)$ cannot converge in norm.


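• A slight variation on this example is $a_n\psi(x) = 0$ for $x = 1, \ldots, n$ (once again), but
now $a_n\psi(x) = \psi(x - n)$ for $x > n$, or, equivalently, $a_n\psi = (0, \ldots, 0, \psi_1, \psi_2, \ldots)$
with $n$ zeros. This time, we have $\|a_n\psi\| = \|\psi\|$, so to begin with, $a_n \to 0$ strongly
is excluded. However, $\langle \varphi, a_n\psi \rangle = \sum_{x=1}^{\infty} \overline{\varphi(x+n)}\psi(x)$, so $\lim_{n\to\infty} \langle \varphi, a_n\psi \rangle = 0$:
to see this, take $N < \infty$ fixed and use Cauchy-Schwarz to estimate

$$|\langle \varphi, a_n\psi \rangle| \leq \Big| \sum_{x=1}^{N} \overline{\varphi(x+n)}\psi(x) \Big| + \Big| \sum_{x=N+1}^{\infty} \overline{\varphi(x+n)}\psi(x) \Big| \leq \|\psi\| \Big( \sum_{x=n+1}^{\infty} |\varphi(x)|^2 \Big)^{1/2} + \|\varphi\| \Big( \sum_{x=N+1}^{\infty} |\psi(x)|^2 \Big)^{1/2}. \tag{B.247}$$

Letting $N \to \infty$ and then $n \to \infty$ yields $\langle \varphi, a_n\psi \rangle \to 0$, so that $a_n \to 0$ weakly.
But $(a_n)$ has no strong limit (for if it existed, it would have to be zero, too).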
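Both examples can be explored numerically on finitely supported sequences. This is only a sketch under stated assumptions: the test vector is a truncation of $(1, 1/2, 1/3, \ldots)$ chosen here for illustration, the helper names are hypothetical, and a finite truncation can only hint at the limiting behaviour in $\ell^2(\mathbb{N})$.

```python
import math

def inner(phi, psi):
    """Real l^2 inner product of finitely supported sequences; zip stops at the
    shorter list, which is right because the missing entries are zero."""
    return sum(p * q for p, q in zip(phi, psi))

def norm(psi):
    return math.sqrt(inner(psi, psi))

def tail(n, psi):
    """First example: (a_n psi)(x) = 0 for x <= n and psi(x) for x > n."""
    return [0.0] * min(n, len(psi)) + list(psi[n:])

def shift(n, psi):
    """Second example: (a_n psi)(x) = 0 for x <= n and psi(x - n) for x > n."""
    return [0.0] * n + list(psi)

def delta(k):
    """The basis vector delta_k (1-based), stored as a list of length k."""
    return [0.0] * (k - 1) + [1.0]

# Hypothetical test vector: a truncation of (1, 1/2, 1/3, ...), which lies in l^2.
M = 2000
psi = [1.0 / x for x in range(1, M + 1)]
phi = psi

for n in (10, 100, 1000):
    print(n,
          norm(tail(n, psi)),            # shrinks with n: strong convergence to 0
          norm(tail(n, delta(n + 1))),   # stays 1: no norm convergence
          norm(shift(n, psi)),           # stays ||psi||: no strong convergence
          inner(phi, shift(n, psi)))     # shrinks with n: weak convergence to 0
```

The second and third columns stay constant while the first and fourth decrease, which mirrors the separation 'norm - strong - weak' described above.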


It is clear from Theorem B.33 that $B(H)$ is sequentially complete in its norm
topology. This is true also in the weak and strong operator topologies:

Proposition B.77. Let $(a_n)$ be a sequence in $B(H)$.

1. If $(a_n\psi)$ converges in $H$ for each $\psi \in H$, then the operator $a : H \to H$ defined by
$a\psi = \lim_n a_n\psi$ is bounded (and hence $a_n \to a$ strongly, where $a \in B(H)$).

2. If $(\langle \varphi, a_n\psi \rangle)$ converges in $\mathbb{C}$ for each $\varphi, \psi \in H$, then there is an operator
$a \in B(H)$ such that $a_n \to a$ weakly.


It is instructive to prove this, using two results of independent interest. 


Theorem B.78. Suppose $V$ is a Banach space, $W$ is a normed space (not necessarily
complete), $X$ is an arbitrary set, and $\{a_x\}_{x \in X}$ is some family of operators in $B(V, W)$
indexed by $X$. If the family is pointwise bounded in that

$$\sup\{\|a_x v\|,\ x \in X\} < \infty \quad (v \in V), \tag{B.248}$$

then the family is uniformly bounded in that

$$\sup\{\|a_x\|,\ x \in X\} < \infty. \tag{B.249}$$




This is the Principle of Uniform Boundedness or Banach—Steinhaus Theorem. 


Proof. If $W$ is not complete, use its completion in what follows. Define $\ell^\infty(X, W)$
to be the set of all bounded functions $f : X \to W$, i.e., those functions such that
$\sup\{\|f(x)\|,\ x \in X\} < \infty$, with pointwise operations. This is easily checked to be
a Banach space itself in the natural norm $\|f\|_\infty = \sup\{\|f(x)\|,\ x \in X\}$ (using the
auxiliary functions $\tilde{f} : X \to \mathbb{C}$ defined by each $f \in \ell^\infty(X, W)$ as $\tilde{f}(x) = \|f(x)\|$, so
that $\|\tilde{f}\|_\infty = \|f\|_\infty$, one may largely reduce the proof to the ordinary $\ell^\infty(X)$ case).
For fixed $v \in V$, define $f_v : X \to W$ by $f_v(x) = a_x(v)$. By assumption, $f_v \in \ell^\infty(X, W)$,
so we may define an operator $F : V \to \ell^\infty(X, W)$ by $F(v) = f_v$. We now
show that the graph $G(F)$ is closed: if $v_n \to v$ in $V$ and $Fv_n \to g$ in $\ell^\infty(X, W)$, then
since uniform convergence implies pointwise convergence, for each $x \in X$ we have

$$g(x) = \lim_n (Fv_n)(x) = \lim_n f_{v_n}(x) = \lim_n a_x v_n = a_x \lim_n v_n = a_x v = f_v(x) = (Fv)(x).$$

Thus $g = Fv$, and hence $G(F)$ is closed. By Theorem B.37, $F$ is bounded, so that:

$$\|F\| = \sup\{\|f_v\|_\infty,\ v \in V,\ \|v\| = 1\} = \sup\{\|a_x v\|,\ v \in V,\ \|v\| = 1,\ x \in X\} = \sup\{\|a_x\|,\ x \in X\} < \infty.$$


This gives part 1 of Proposition B.77: since $\lim_n a_n\psi$ exists, $\sup_n\{\|a_n\psi\|\} < \infty$ for
each $\psi$, hence $\sup_n\{\|a_n\|\} < \infty$. Since $a_n\psi \to a\psi$ implies $\|a_n\psi\| \to \|a\psi\|$, cf. (B.5),

$$\|a\psi\| = \lim_n \|a_n\psi\| \leq \limsup_n \|a_n\|\,\|\psi\| \leq \sup_n\{\|a_n\|\}\,\|\psi\|, \tag{B.250}$$

so taking the supremum over all unit vectors $\psi$ gives $\|a\| < \infty$.

As to the second part, since $(\langle \varphi, a_n\psi \rangle)$ converges
for each $\varphi, \psi \in H$, we have $\sup_n\{|\langle \varphi, a_n\psi \rangle|\} < \infty$. Using (B.222), this is the same as
$\sup_n\{|f_{a_n\psi}(\varphi)|\} < \infty$ for each $\varphi \in H$, so using Banach-Steinhaus with $V = H$, $W = \mathbb{C}$
(so that $B(V, W) = H^*$), $X = \mathbb{N}$, and $a_n = f_{a_n\psi}$, we find $\sup_n\{\|f_{a_n\psi}\|_{H^*}\} < \infty$. By Theorem B.66, this
gives $\sup_n\{\|a_n\psi\|\} < \infty$, and hence, via a second application of Theorem B.78,
$\sup_n\{\|a_n\|\} < \infty$, or $\|a_n\| \leq C < \infty$ for all $n$, as in the case of strong limits.

This time we have to do a little more work to construct the limit operator $a$. This
requires a second lemma, which generalizes Proposition A.23 to general Hilbert
spaces. To this effect, we say that a sesquilinear form $B : H \times H \to \mathbb{C}$ is bounded if there
is a finite constant $C$ such that $|B(\varphi, \psi)| \leq C\|\varphi\|\,\|\psi\|$ for all $\varphi, \psi \in H$.


Proposition B.79. The relation $B(\varphi, \psi) = \langle \varphi, a\psi \rangle$ provides a bijective correspondence
between bounded (hermitian/positive) sesquilinear forms and bounded
(self-adjoint/positive) operators $a \in B(H)$, cf. Proposition A.22.1.


Like Proposition A.23, this is a trivial consequence of Theorem B.66. 
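To finish the proof of Proposition B.77.2, define $B(\varphi, \psi) = \lim_n \langle \varphi, a_n\psi \rangle$, so

$$|B(\varphi, \psi)| \leq \limsup_n \|a_n\|\,\|\varphi\|\,\|\psi\| \leq \sup_n \|a_n\|\,\|\varphi\|\,\|\psi\| \leq C\|\varphi\|\,\|\psi\|. \tag{B.251}$$

Hence $B$ is bounded, and Proposition B.79 gives the weak limit $a \in B(H)$.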
B.14 Basic spectral theory


In linear algebra, which in our context means the theory of operators on finite-dimensional
Hilbert spaces $H$, the spectrum $\sigma(a)$ of an operator (i.e., a linear map)
$a : H \to H$ was defined as the set of eigenvalues of $a$. This led to the Spectral Theorems
A.10 and A.15. However, as soon as $\dim(H) = \infty$, simple examples show that
even bounded operators may have no eigenvectors (and hence no eigenvalues) at all.
For example, take $H = L^2(0, 1)$ and $f(x) = x$, with associated (bounded) multiplication
operator $a = m_f = m_x$, cf. (B.238); this is just a bounded version of the position
operator of quantum mechanics. Then the eigenvalue equation $a\psi = \lambda\psi$ implies
$\int_0^1 dx\, |x - \lambda|^2 |\psi(x)|^2 = 0$, which holds iff $|x - \lambda|\,|\psi(x)| = 0$ a.e. Since $|x - \lambda|$ is
nonzero a.e. for any $\lambda \in \mathbb{C}$, this implies $\psi(x) = 0$ a.e. and hence $\psi = 0$ in $L^2(0, 1)$.
More generally, taking $H = L^2(\mathbb{R}^d)$ and $f \in C_b(\mathbb{R}^d)$, a similar argument shows that
the multiplication operator $m_f$ has eigenvalue $\lambda \in \mathbb{C}$ whenever the equality $f(x) = \lambda$
holds on a set of positive (Lebesgue) measure. Therefore, if $f$ varies sufficiently,
then $m_f$ has no eigenvalues at all (e.g., in $d = 1$, $f \in C^1([0, 1])$ with $f'(x) \neq 0$ a.e.).
Even amidst his magnificent oeuvre, covering most of mathematics, it was one 
of Hilbert’s most prophetic insights that finite-dimensional spectral theory could not 
merely be rescued, but also greatly enriched, by defining the spectrum as follows: 


Definition B.80. Let $H$ be a Hilbert space. The spectrum $\sigma(a)$ of $a \in B(H)$ consists
of all $\lambda \in \mathbb{C}$ for which the operator $a - \lambda : H \to H$ is not bijective. The complement

$$\rho(a) = \mathbb{C} \setminus \sigma(a) \tag{B.252}$$

of the spectrum in $\mathbb{C}$ is called the resolvent of $a$, i.e., $z \in \rho(a)$ iff $a - z$ is invertible.


Here $a - \lambda \equiv a - \lambda \cdot 1_H$, where $1_H$ is the unit operator on $H$, and by 'bijective' and
'invertible' we a priori mean: injective and surjective. This set-theoretic notion of
invertibility is considerably strengthened by Corollary B.35, according to which the
set-theoretic inverse of $a - \lambda : H \to H$, if it exists for $a \in B(H)$, is automatically in
$B(H)$. Consequently, we may equivalently say that $\lambda \in \sigma(a)$ if $a - \lambda$ is not invertible
in $B(H)$. This means that if $z \in \rho(a)$, then the equation $(a - z)\psi = \varphi$ for $\psi \in H$:

• actually has a solution, since $(a - z)$ is surjective;
• has a unique solution, for $(a - z)$ is injective;
• has a unique solution that continuously depends on $\varphi$, as $(a - z)^{-1}$ is bounded.


Thus Definition B.80 becomes a special case of the following purely algebraic idea: 


Definition B.81. Let $A$ be a (complex) algebra with unit. The spectrum $\sigma(a)$ of
$a \in A$ consists of all $\lambda \in \mathbb{C}$ for which the operator $a - \lambda$ is not invertible in $A$.


The notation (B.252) also extends to this case. This generalization is especially powerful
when $A$ is a Banach algebra, and, particularly, a C*-algebra, cf. Definition C.1.
The latter case actually incorporates Definition B.80:

Proposition B.82. For any Hilbert space $H$, the set $B(H)$ of all bounded operators
on $H$ is a C*-algebra with unit in the operator norm (A.18).

The proof of Proposition A.7 goes through unchanged. In a different direction:

Proposition B.83. Let $A = C(X)$, where $X$ is a compact Hausdorff space. Then

$$\sigma(f) = \mathrm{ran}(f). \tag{B.253}$$

Proof. Since multiplication in $C(X)$ is pointwise, if $f - \lambda \cdot 1_X$ has an inverse, it must
be $1/(f - \lambda \cdot 1_X)$. This function exists (and is continuous) iff $\lambda \notin \mathrm{ran}(f)$.


Theorem B.84. Let $A = B(H)$ or, more generally, a unital C*-algebra, or, even more
generally, a Banach algebra with unit $1_A$ (cf. Definition C.1). Then the spectrum
$\sigma(a)$ of any $a \in A$ is a nonempty compact subset of $\mathbb{C}$.

Furthermore, defining the spectral radius of $a \in A$ by

$$r(a) = \sup\{|\lambda|,\ \lambda \in \sigma(a)\}, \tag{B.254}$$

for general unital Banach algebras we have

$$r(a) \leq \|a\|, \tag{B.255}$$

as well as Gelfand's spectral radius formula

$$r(a) = \lim_{n \to \infty} \|a^n\|^{1/n}. \tag{B.256}$$

If $a \in A_{\mathrm{sa}}$ is a self-adjoint element of a unital C*-algebra, such as $A = B(H)$, then

$$r(a) = \|a\| \quad (a^* = a). \tag{B.257}$$


Proof. The claim about the spectrum obviously follows from the following facts:

1. $\sigma(a)$ is a bounded subset of $\mathbb{C}$.
2. $\sigma(a)$ is a closed subset of $\mathbb{C}$.
3. $\sigma(a)$ is a nonempty subset of $\mathbb{C}$.

Eq. (B.255) is equivalent to the implication $|\lambda| > \|a\| \Rightarrow \lambda \in \rho(a)$. For $\lambda \neq 0$ we
have $(a - \lambda) = \lambda((a/\lambda) - 1)$, so, rescaling $a$ if necessary, we only need to show
that if $\|a\| < 1$, then $1 \in \rho(a)$. Indeed, in that case the geometric series $\sum_k a^k$ for $a$
converges absolutely and hence ($A$ being a Banach space) converges, with

$$\sum_{k=0}^{\infty} a^k = (1 - a)^{-1}; \tag{B.258}$$

the proof is virtually the same as for complex numbers. Thus $1 \in \rho(a)$.
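The geometric series (B.258) is easy to test in the Banach algebra of $2 \times 2$ matrices. The sketch below is illustrative only: the matrix `a`, the number of terms, and the use of the max-row-sum norm (a submultiplicative norm, standing in for the operator norm of the text) are assumptions made here.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(a, terms):
    """Partial sum 1 + a + a^2 + ... + a^(terms - 1) of the geometric series."""
    n = len(a)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

def max_row_sum(a):
    """A submultiplicative matrix norm (operator norm for the sup-norm on C^n)."""
    return max(sum(abs(x) for x in row) for row in a)

# Hypothetical sample element with norm 0.7 < 1.
a = [[0.2, 0.5], [-0.3, 0.1]]
one_minus_a = [[identity(2)[i][j] - a[i][j] for j in range(2)] for i in range(2)]

s = neumann_sum(a, 200)
product = matmul(one_minus_a, s)   # should be close to the identity
residual = max(abs(product[i][j] - identity(2)[i][j])
               for i in range(2) for j in range(2))
```

Since $(1 - a)\sum_{k<N} a^k = 1 - a^N$ and $\|a^N\| \leq \|a\|^N$, the residual decays geometrically in the number of terms.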
Fact 2 is equivalent to the set $A^\times$ of invertible elements in $A$ being open in $A$.
Indeed, for given $a \in A^\times$, take a $b \in A$ for which $\|b\| < \|a^{-1}\|^{-1}$. This implies

$$\|a^{-1}b\| \leq \|a^{-1}\|\,\|b\| < 1. \tag{B.259}$$

Hence by (B.258), applied to $-a^{-1}b$ (whose norm is less than 1), the operator $a + b = a(1 + a^{-1}b)$ has an inverse,
namely $(1 + a^{-1}b)^{-1}a^{-1}$. Taking $\varepsilon \leq \|a^{-1}\|^{-1}$, it follows that all $c \in A$ for which
$\|a - c\| < \varepsilon$ lie in $A^\times$ (which is therefore an open subset of the metric space $A$).
For the third claim, take $a \in A$ and define $f : \mathbb{C} \to A$ by $f(z) = z - a$. Since

$$\|f(z + \delta) - f(z)\| = |\delta|,$$

we see that $f$ is continuous (take $\delta = \varepsilon$ in the definition of continuity). By part 2
of the proof, $f^{-1}(A^\times)$ is open in $\mathbb{C}$. But $f^{-1}(A^\times)$ is the set of all $z \in \mathbb{C}$ where $z - a$
has an inverse, so that $f^{-1}(A^\times) = \rho(a)$. This set being open, its complement $\sigma(a)$
is closed. Now define

$$g : \rho(a) \to A; \tag{B.260}$$
$$z \mapsto (z - a)^{-1}. \tag{B.261}$$


For fixed $z_0 \in \rho(a)$, choose $z \in \mathbb{C}$ such that $|z - z_0| < \|(a - z_0)^{-1}\|^{-1}$. From part 2
of the proof, with $a$ replaced by $a - z_0$ and $c$ replaced by $a - z$, we see that $z \in \rho(a)$,
as $\|a - z_0 - (a - z)\| = |z - z_0|$. Moreover, because

$$\|(z_0 - z)(z_0 - a)^{-1}\| = |z_0 - z|\,\|(z_0 - a)^{-1}\| < 1, \tag{B.262}$$

the power series

$$\frac{1}{z_0 - a} \sum_{k=0}^{n} \left( \frac{z_0 - z}{z_0 - a} \right)^k \tag{B.263}$$

is absolutely convergent and hence convergent for $n \to \infty$. By (B.258), the limit
$n \to \infty$ of this power series is

$$\frac{1}{z_0 - a} \sum_{k=0}^{\infty} \left( \frac{z_0 - z}{z_0 - a} \right)^k = \frac{1}{z_0 - a} \left( 1 - \frac{z_0 - z}{z_0 - a} \right)^{-1} = \frac{1}{z - a} = g(z). \tag{B.264}$$

Hence

$$g(z) = \sum_{k=0}^{\infty} (z_0 - z)^k (z_0 - a)^{-k-1} \tag{B.265}$$

is a norm-convergent power series. For $z \neq 0$ we write $\|g(z)\| = |z|^{-1} \|(1_A - a/z)^{-1}\|$
and observe that $\lim_{z \to \infty} (1_A - a/z) = 1_A$, since $\lim_{z \to \infty} \|a/z\| = 0$. Hence we obtain
$\lim_{z \to \infty} (1_A - a/z)^{-1} = 1_A$, and

$$\lim_{z \to \infty} \|g(z)\| = 0. \tag{B.266}$$


Let $\varphi \in A^*$; since $\varphi$ is bounded, eq. (B.265) implies that the function $g_\varphi : z \mapsto \varphi(g(z))$
is given by a convergent power series (i.e. is analytic), and (B.266) implies

$$\lim_{z \to \infty} g_\varphi(z) = 0. \tag{B.267}$$

Now suppose that $\sigma(a) = \emptyset$, so that $\rho(a) = \mathbb{C}$. The function $g$, and hence $g_\varphi$, is
then defined on $\mathbb{C}$, where it is analytic and vanishes at infinity. In particular, $g_\varphi$ is
bounded, so that by Liouville's Theorem of elementary complex analysis it must be
constant. By (B.267) this constant is zero, so that $g = 0$ by Corollary B.45. This is
absurd, so that $\rho(a) \neq \mathbb{C}$, and hence $\sigma(a) \neq \emptyset$.

We now prove the spectral radius formula (B.256). For $|z| > \|a\|$ the function $g$,
defined in (B.260) - (B.261), has a norm-convergent power series

$$g(z) = \frac{1}{z} \sum_{k=0}^{\infty} \left( \frac{a}{z} \right)^k. \tag{B.268}$$

On the other hand, we have seen that for any $z \in \rho(a)$ one may find a $z_0 \in \rho(a)$ such
that the power series (B.265) converges (i.e. in norm). If $|z| > r(a)$ then $z \in \rho(a)$,
so (B.265) converges for $|z| > r(a)$, uniformly in $z$. Therefore (by the theory of
analytic functions taking values in Banach spaces), eq. (B.268) is norm-convergent
for $|z| > r(a)$, too, which in turn implies that $\|a^n\|/|z|^n < 1$ for large enough $n$ (proof
by contradiction). Since this is true for all $z$ for which $|z| > r(a)$, we must have

$$\limsup_{n \to \infty} \|a^n\|^{1/n} \leq r(a). \tag{B.269}$$
n—-oo 


To derive a second inequality towards (B.256), we use the spectral mapping property
for polynomials, which states that for any (complex) polynomial $p$ on $\mathbb{C}$,

$$\sigma(p(a)) = p(\sigma(a)) \equiv \{p(\lambda) \mid \lambda \in \sigma(a)\}. \tag{B.270}$$

Given some polynomial $p$ of degree $n$ (in a variable $z$) and some fixed $\lambda \in \mathbb{C}$, let

$$q(z) = p(z) - \lambda = c_0 \prod_{k=1}^{n} (z - c_k), \tag{B.271}$$

for some $c_0, \ldots, c_n \in \mathbb{C}$. Hence by (A.53) - (A.55), we have $q(a) = c_0 \prod_{k=1}^{n} (a - c_k)$.
Now a product $b = b_1 \cdots b_n$ of commuting operators is invertible iff each factor $b_i$ is invertible (in which
case $b^{-1} = b_n^{-1} \cdots b_1^{-1}$), so $\lambda \in \sigma(p(a))$ iff some $c_k \in \sigma(a)$ (where $k > 0$, as $c_0 \neq 0$).
Since the $c_k$ are exactly the solutions of $q(c_k) = 0$, i.e., of $\lambda = p(c_k)$, this proves (B.270).
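For matrices the spectral mapping property (B.270) can be observed directly. The following sketch relies on the standard fact that the eigenvalues of an upper-triangular matrix are its diagonal entries; the polynomial, the matrix, and the helper names are hypothetical choices made here for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, a):
    """Evaluate p(a) = coeffs[0] z^d + ... + coeffs[d] at z = a by Horner's rule."""
    n = len(a)
    result = [[0.0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, a)
        for i in range(n):
            result[i][i] += c
    return result

def poly(coeffs, z):
    """Evaluate the same polynomial at a number z."""
    value = 0.0
    for c in coeffs:
        value = value * z + c
    return value

# Hypothetical example: p(z) = z^2 - 3z + 1 and an upper-triangular matrix,
# whose eigenvalues are its diagonal entries 2 and -1.
coeffs = [1.0, -3.0, 1.0]
a = [[2.0, 5.0], [0.0, -1.0]]
p_a = poly_of_matrix(coeffs, a)

spectrum_a = [a[0][0], a[1][1]]
spectrum_p_a = [p_a[0][0], p_a[1][1]]      # p(a) is again upper triangular
image_of_spectrum = [poly(coeffs, lam) for lam in spectrum_a]
```

Here `spectrum_p_a` and `image_of_spectrum` agree entry by entry, as (B.270) predicts.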

To conclude the proof of (B.256), we note that since $\sigma(a)$ is closed, there is
$\lambda \in \sigma(a)$ for which $|\lambda| = r(a)$. Since $\lambda^n \in \sigma(a^n)$ by (B.270), one has $|\lambda^n| \leq \|a^n\|$
by (B.255). Hence $\|a^n\|^{1/n} \geq |\lambda| = r(a)$. Combining this with (B.269) yields

$$\limsup_{n \to \infty} \|a^n\|^{1/n} \leq r(a) \leq \|a^m\|^{1/m} \quad (m \in \mathbb{N}). \tag{B.272}$$

Hence the limit must exist, and $\lim_{n \to \infty} \|a^n\|^{1/n} = \inf_n \|a^n\|^{1/n} = r(a)$, i.e., (B.256).
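The gap between $r(a)$ and $\|a\|$, and its closing under (B.256), shows up already for a non-normal $2 \times 2$ matrix. The sketch below is an illustration under assumptions: the matrix is a hypothetical example, and the max-row-sum norm is used instead of the Hilbert space operator norm (any norm making the matrices a unital Banach algebra yields the same limit).

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_row_sum(a):
    """A submultiplicative matrix norm (operator norm for the sup-norm on C^n)."""
    return max(sum(abs(x) for x in row) for row in a)

def root_norm(a, n):
    """Return ||a^n||^(1/n) for the max-row-sum norm."""
    power = a
    for _ in range(n - 1):
        power = matmul(power, a)
    return max_row_sum(power) ** (1.0 / n)

# Hypothetical non-normal example: both eigenvalues equal 0.5, so r(a) = 0.5,
# while ||a|| = 1.5 in the max-row-sum norm.
a = [[0.5, 1.0], [0.0, 0.5]]
for n in (1, 2, 5, 20, 100, 400):
    print(n, root_norm(a, n))
```

The printed values decrease slowly towards 0.5 and never drop below it, in line with (B.272).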

Finally, given axiom (C.2) for C*-algebras (which include $B(H)$ by Proposition
A.7 and Theorem B.33), eq. (B.257) follows from (B.256): for self-adjoint $a$, eq.
(C.2) reads $\|a^2\| = \|a\|^2$, and hence $\|a^{2^k}\| = \|a\|^{2^k}$ by induction, so if we take the limit in (B.256)
along the subsequence $n = 2^k$ (as we are entitled to, given convergence), we obtain (B.257).
We may also generalize Definition B.80 in a different direction, where we allow
$a : D(a) \to H$ to be unbounded. In that case, there is room for some ambiguity, as
a possible set-theoretic inverse of $a - z$, if it exists as a (necessarily linear) map
$(a - z)^{-1} : H \to D(a)$, is no longer guaranteed to be bounded. By the argument
preceding Definition B.81 this would, of course, be desirable, which motivates:

Definition B.85. Let $H$ be a Hilbert space, and let $a : D(a) \to H$ be a possibly unbounded
operator (always by definition with dense domain).

1. The resolvent $\rho(a)$ consists of all $z \in \mathbb{C}$ for which $a - z : D(a) \to H$ has a
bounded (linear) inverse $(a - z)^{-1} : H \to D(a)$, so that $(a - z)^{-1} \in B(H)$.
2. The spectrum $\sigma(a) = \mathbb{C} \setminus \rho(a)$ is the complement of the resolvent (i.e. in $\mathbb{C}$).


This provides further motivation for requiring an unbounded operator to be closed:

Proposition B.86. Let $a : D(a) \to H$ be a possibly unbounded operator.

• If $a$ is closed, then $z \in \rho(a)$ iff $a - z$ has a set-theoretic inverse.
• If $a$ is not closed, then $\rho(a) = \emptyset$.

Proof. The graph $G(a^{-1})$ in $H \oplus H$ is the image of $G(a)$ under the linear homeomorphism
$(\psi_1, \psi_2) \mapsto (\psi_2, \psi_1)$, hence if $a$ is closed, then $a^{-1}$ is closed and hence
bounded (cf. Theorem B.37). Similarly, if $G(a)$ is not closed, then $G(a^{-1})$ cannot
be closed either, and hence $a^{-1}$ cannot be bounded. Likewise with $a \rightsquigarrow a - z$.


Thus spectral theory always deals with closed operators a, like self-adjoint ones. 
We now show that Definition B.80 is compatible with our earlier §A.4. 


Proposition B.87. Let $V$ be a finite-dimensional vector space and let $a : V \to V$ be
a linear map. Then $a$ is injective iff it is surjective.

Proof. This follows from the elementary fact that for any linear map $a : V \to W$ one
has $\mathrm{ran}(a) \cong V/\ker(a)$. Now if $V = W$ is finite-dimensional one has $V \cong \mathbb{C}^n$ (on
choice of a basis), and one may simply count dimensions to infer that

$$\dim(\mathrm{ran}(a)) = n - \dim(\ker(a)).$$

Surjectivity of $a$ then yields injectivity and vice versa: we have $\dim(\mathrm{ran}(a)) = n$ iff
$\dim(\ker(a)) = 0$ iff $\ker(a) = 0$.


Note that this proposition yields the very simplest case of the Atiyah-Singer index
theorem, for which these mathematicians received the Abel Prize for 2004. We
define the index of a linear map $a : V \to W$ as

$$\mathrm{index}(a) = \dim(\ker(a)) - \dim(\mathrm{coker}(a)), \tag{B.273}$$

where $\mathrm{coker}(a) = W/\mathrm{ran}(a)$, provided both quantities are finite. If $V$ and $W$ are
finite-dimensional, Proposition B.87 yields the baby index theorem

$$\mathrm{index}(a) = \dim(V) - \dim(W). \tag{B.274}$$
In particular, if $V = W$, then $\mathrm{index}(a) = 0$ for any linear map $a$ (in general, the
index theorem expresses the index of an operator in terms of topological data; in
this simple situation the only such data are the dimensions of $V$ and $W$).


Corollary B.88. If $a$ is an operator on a finite-dimensional Hilbert space, then the
spectrum $\sigma(a)$ of $a$ is the set of its eigenvalues.

Proof. It immediately follows from Proposition B.87 that $a - z$ is invertible iff $z$ is
not an eigenvalue of $a$.


Returning to Definition B.80, we see that if $\lambda$ is an eigenvalue of $a$ (in that, as
in finite dimension, there exists a nonzero vector $\psi \in H$ for which $a\psi = \lambda\psi$), then
$\lambda \in \sigma(a)$ (for $a - \lambda$ is not even injective, let alone invertible). Thus we may define:

• the point spectrum $\sigma_p(a)$ of $a$ as the set of its eigenvalues, so that $\sigma_p(a) \subseteq \sigma(a)$;
• the continuous spectrum, which (if it exists) is the remainder of $\sigma(a)$, i.e.,

$$\sigma_c(a) = \sigma(a) \setminus \sigma_p(a). \tag{B.275}$$

If $\sigma(a) = \sigma_p(a)$, we call $\sigma(a)$ discrete. The example at the beginning of this section
shows the opposite case, viz. $\sigma_p(m_x) = \emptyset$ and $\sigma_c(m_x) = [0, 1]$. This follows from:


Proposition B.89. Let $H = L^2(X, \mu)$ for some $\sigma$-finite Borel space $(X, \Sigma, \mu)$ such
that $\mu(A) > 0$ for each open $A \subset X$, and let $f \in C(X)$. Then

$$\sigma(m_f) = \mathrm{ran}(f)^-. \tag{B.276}$$

Cf. Proposition B.73. More generally, let $f : X \to \mathbb{C}$ be (Borel) measurable. Then

$$\sigma(m_f) = \text{ess-ran}(f), \tag{B.277}$$

where the essential range $\text{ess-ran}(f)$ of $f$ consists of all $z \in \mathbb{C}$ such that

$$\forall \varepsilon > 0 : \mu(\{x \in X : |f(x) - z| < \varepsilon\}) > 0. \tag{B.278}$$


Proof. The second claim implies the first, for $\text{ess-ran}(f) = \mathrm{ran}(f)^-$ if $f \in C(X)$.
To prove the second claim, we use the functions $\varphi_n = 1_{X_n}\psi$ from the proof of
Proposition B.73, where $\psi \in H$ is arbitrary. If $0 \notin \sigma(m_f)$, then $m_f$ is invertible, so
there is $b \in B(H)$ such that $fb\varphi_n = \varphi_n$. This implies that $f(x) \neq 0$ a.e. on $X_n$, with
$b\varphi_n = m_{1/f}\varphi_n$. Because $n \in \mathbb{N}$ is also arbitrary and $X = \cup_n X_n$, this gives $f(x) \neq 0$
a.e. on $X$, and since the linear span of the $\varphi_n$ is dense in $H$, we obtain $b = m_{1/f}$
provided $b = m_f^{-1}$ exists (which should not surprise us, for $m_f m_g = m_{fg}$). From
(B.240), with $f \rightsquigarrow 1/f$, we then obtain $\|1/f\|_\infty^{\mathrm{ess}} = \|m_{1/f}\| < \infty$ (from $0 \in \rho(m_f)$).
The point is that $\|1/f\|_\infty^{\mathrm{ess}} < \infty$ iff there is $\varepsilon > 0$ such that $|f(x)| \geq \varepsilon$ almost
everywhere, i.e., $\mu(\{x \in X : |f(x)| < \varepsilon\}) = 0$. The negation of this condition states
that $\forall \varepsilon > 0 : \mu(\{x \in X : |f(x)| < \varepsilon\}) > 0$, that is, $0 \in \text{ess-ran}(f)$. Therefore, we have
shown that $0 \in \sigma(m_f)$ iff $0 \in \text{ess-ran}(f)$; if $f \in C(X)$, this is the same as $0 \in \mathrm{ran}(f)^-$.
To finish, note that $m_f - \lambda \cdot 1_H = m_{f - \lambda}$, where $f - \lambda$ is the function $x \mapsto f(x) - \lambda$.
This gives $\lambda \in \sigma(m_f)$ iff $0 \in \sigma(m_{f - \lambda})$, which is true iff $\lambda \in \text{ess-ran}(f)$.
Corollary B.90. If $\mu(f = \lambda) = 0$ for all $\lambda \in \mathbb{C}$, then $\sigma_p(m_f) = \emptyset$.

Thus the combination $\sigma_p(a) = \emptyset$ and $\sigma_c(a) \neq \emptyset$, which is the opposite of the
finite-dimensional situation, is very well possible. To shed further light on the still
somewhat mysterious idea of a continuous spectrum, we now present Weyl's theory of
the spectrum. We say that a possibly unbounded operator $a : D(a) \to H$ is normal
when $D(a^*) = D(a)$ and $\|a^*\psi\| = \|a\psi\|$ for each $\psi \in D(a)$; if $a$ is bounded, this is
equivalent to the familiar definition $a^*a = aa^*$. Self-adjoint operators are normal.


Theorem B.91. Let $a : D(a) \to H$ be normal. Then $\lambda \in \sigma(a)$ iff there exists a sequence
$(\psi_n)$ of unit vectors in $D(a)$ such that

$$\lim_{n \to \infty} \|(a - \lambda)\psi_n\| = 0. \tag{B.279}$$


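Of course, this is useful only as a new characterization of $\lambda \in \sigma_c(a)$; if $\lambda \in \sigma_p(a)$
one may simply take $\psi_n = \psi$ for all $n$, where $a\psi = \lambda\psi$. For a simple example, take

$$H = L^2(\mathbb{R}); \tag{B.280}$$
$$a = m_f \quad (f \in C_c(\mathbb{R})), \tag{B.281}$$
$$\lambda = f(x_0) \quad (x_0 \in \mathbb{R}), \tag{B.282}$$

so that $\lambda \in \mathrm{ran}(f) \subseteq \sigma_c(m_f) = \sigma(m_f)$, and

$$\psi_n(x) = (n/\pi)^{1/4} e^{-n(x - x_0)^2/2}. \tag{B.283}$$

Then $\|\psi_n\| = 1$ and $\lim_n \|(m_f - \lambda)\psi_n\| = 0$, although $(\psi_n)$ has no limit in $L^2(\mathbb{R})$.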
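The behaviour of the Gaussians (B.283) can be checked by numerical integration. The sketch below makes several assumptions that are not in the text: it takes $f(x) = x$ (the position operator, which is unbounded but normal) and $x_0 = 1$, for which a direct Gaussian integral gives $\|(m_f - \lambda)\psi_n\| = 1/\sqrt{2n}$; the quadrature routine and all names are illustrative.

```python
import math

def psi_n(n, x, x0):
    """The Gaussian (n/pi)^(1/4) exp(-n (x - x0)^2 / 2) of (B.283)."""
    return (n / math.pi) ** 0.25 * math.exp(-n * (x - x0) ** 2 / 2.0)

def l2_norm(g, a, b, steps=20000):
    """L^2 norm of g over [a, b], computed with the midpoint rule."""
    h = (b - a) / steps
    total = sum(abs(g(a + (i + 0.5) * h)) ** 2 for i in range(steps))
    return math.sqrt(total * h)

def weyl_defect(n, x0=1.0, f=lambda x: x):
    """Return (||psi_n||, ||(m_f - lambda) psi_n||) with lambda = f(x0)."""
    lam = f(x0)
    half_width = 12.0 / math.sqrt(n)   # the Gaussian is negligible beyond this
    a, b = x0 - half_width, x0 + half_width
    norm_psi = l2_norm(lambda x: psi_n(n, x, x0), a, b)
    defect = l2_norm(lambda x: (f(x) - lam) * psi_n(n, x, x0), a, b)
    return norm_psi, defect

for n in (1, 10, 100):
    print(n, weyl_defect(n))
```

The first entry of each pair stays at 1 while the second shrinks like $1/\sqrt{2n}$, so the $\psi_n$ form a sequence of the kind required in (B.279) even though no eigenvector exists.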


Proof. One direction is easy by reductio ad absurdum: if the given sequence $(\psi_n)$
exists yet $\lambda \in \rho(a)$, then, since $(a - \lambda)^{-1}$ would exist and would be bounded, for
any sequence $(\varphi_n)$ in $H$, $\varphi_n \to 0$ implies $(a - \lambda)^{-1}\varphi_n \to 0$, so taking $\varphi_n = (a - \lambda)\psi_n$,
we find that $(a - \lambda)\psi_n \to 0$ implies $\psi_n \to 0$. Therefore, the assumption $\|\psi_n\| = 1$
cannot be true, and hence $\lambda \notin \rho(a)$, which is to say that $\lambda \in \sigma(a)$.

The converse direction requires two instructive lemmas of independent interest.


Lemma B.92. Let $a \in B(H)$ (or, more generally, let $a : D(a) \to H$ be closed). Then

$$\mathrm{ran}(a)^- = \ker(a^*)^\perp; \tag{B.284}$$
$$\mathrm{ran}(a)^\perp = \ker(a^*). \tag{B.285}$$

In particular, we have $\mathrm{ran}(a)^- = H$ iff $\ker(a^*) = \{0\}$.
Furthermore, we say that $a$ is norm-positive (a neologism!) if there exists $\alpha > 0$
such that $\|a\psi\| \geq \alpha\|\psi\|$ for each $\psi \in H$ (or each $\psi \in D(a)$). Then:

1. If $a$ is norm-positive, then $\mathrm{ran}(a)$ is closed.
2. The operator $a$ is invertible iff $a$ is norm-positive and $\ker(a^*) = \{0\}$.
3. A normal operator is invertible iff it is norm-positive.
The last point provides the remainder of the proof of Theorem B.91, for if $\lambda \in \sigma(a)$,
then $a - \lambda$ is not invertible, so for each $\varepsilon = 1/n$ there is a unit vector $\psi_n \in H$ (or
$\psi_n \in D(a)$) such that $\|(a - \lambda)\psi_n\| < 1/n$, and hence we have our sequence $(\psi_n)$.


It remains to prove Lemma B.92. Eqs. (B.284) - (B.285) are easy exercises, using
(B.204). For clause 1, if $(\varphi_n)$ is a Cauchy sequence in $\mathrm{ran}(a)$ converging to $\varphi \in H$,
then $\varphi_n = a\psi_n$ for some $\psi_n \in D(a)$. Since $\|\psi_n - \psi_m\| \leq \alpha^{-1}\|\varphi_n - \varphi_m\|$, the sequence
$(\psi_n)$ is Cauchy, too, and if $\psi_n \to \psi$, then $\varphi_n \to a\psi = \varphi$, so $\varphi \in \mathrm{ran}(a)$; in
the unbounded case this is because $a$ is closed. For clause 2, if $a$ is invertible, then
for $\psi \in D(a)$, we have $\|\psi\| = \|a^{-1}a\psi\| \leq \|a^{-1}\|\,\|a\psi\|$, since $a^{-1}$ is bounded, and
therefore $a$ is norm-positive with (for example) $\alpha = \|a^{-1}\|^{-1}$. Moreover, invertibility
implies surjectivity, i.e., $\mathrm{ran}(a) = H$, and hence $\ker(a^*) = \{0\}$ by (B.284).

Conversely, if $a$ is norm-positive, then it is trivially injective, and if $\ker(a^*) = \{0\}$,
then $\mathrm{ran}(a)^- = H$, again by (B.284). But since $a$ is also norm-positive,
$\mathrm{ran}(a)^- = \mathrm{ran}(a)$, so $\mathrm{ran}(a) = H$ and $a$ is surjective, too. Clause 3 now also follows,
since for normal operators $a$ we have $\ker(a) = \ker(a^*)$, so $a$ being norm-positive,
which implies $\ker(a) = \{0\}$ in any case, now also implies $\ker(a^*) = \{0\}$.


The same lemma yields crucial information on spectra of self-adjoint operators. 


Theorem B.93. If $a: D(a) \to H$ is self-adjoint, then $\sigma(a) \subset \mathbb{R}$, and if two eigenvalues
$\lambda, \lambda' \in \sigma_p(a)$ are different, then corresponding eigenvectors are orthogonal.
Furthermore, for each $z \in \mathbb{C}$ exactly one of the following possibilities applies:

• $z \in \rho(a)$ iff $\mathrm{ran}(a-z) = H$;

• $z \in \sigma_c(a)$ iff $\mathrm{ran}(a-z)^- = H$ but $\mathrm{ran}(a-z) \ne H$;

• $z \in \sigma_p(a)$ iff $\mathrm{ran}(a-z)^- \ne H$.

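Proof. If $a^* = a$ then $\langle\psi, a\psi\rangle$ is real, so $|\langle\psi, (a-z)\psi\rangle| \ge |\mathrm{Im}(z)|\,\|\psi\|^2$ for any $z \in \mathbb{C}$.
Combined with Cauchy-Schwarz, this gives the inequality

$$\|(a-z)\psi\| \ge |\mathrm{Im}(z)|\,\|\psi\|. \tag{B.286}$$

Therefore, for $z \in \mathbb{C}\setminus\mathbb{R}$ the normal operator $a - z$ is norm-positive, and hence invertible
by Lemma B.92.3, so that $\sigma(a) \subset \mathbb{R}$. Next, if $a\psi = \lambda\psi$ and $a\psi' = \lambda'\psi'$, then

$$\langle\psi,\psi'\rangle = \frac{1}{\lambda-\lambda'}\big(\langle a\psi,\psi'\rangle - \langle\psi,a\psi'\rangle\big) = \frac{1}{\lambda-\lambda'}\langle\psi,(a^*-a)\psi'\rangle = 0, \tag{B.287}$$

given that $\lambda, \lambda' \in \mathbb{R}$ and assuming $\lambda' \ne \lambda$ and $a^* = a$.

Furthermore, for $z \in \mathbb{C}\setminus\mathbb{R}$, we have $z \in \rho(a)$ and hence trivially $\mathrm{ran}(a - z) = H$;
conversely, the latter property states surjectivity of $a - z$, whilst (B.286) yields
injectivity, so jointly, $z \in \rho(a)$. For $z \in \mathbb{R}$, assuming $\mathrm{ran}(a - z) = H$, eq. (B.285)
yields $\ker(a^* - \bar{z}) = \{0\}$, but since $a^* = a$ and $\bar{z} = z$, this is just injectivity of $a-z$,
whence once more $z \in \rho(a)$. Similarly, if $z \in \mathbb{R}$, then $\mathrm{ran}(a-z)^- \ne H$ iff $\ker(a-z) \ne
\{0\}$, which yields the third case $z \in \sigma_p(a)$. The middle case is all that remains.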


This result reconfirms Corollary B.88 to the effect that continuous spectrum cannot
occur if $\dim(H) < \infty$, since in that case (where linear subspaces are automatically
closed) the second scenario in Theorem B.93 is impossible.
B.15 The spectral theorem


Although he did not live to see it, on Hilbert’s visionary Definition B.80 of the spectrum,
part 1 of Theorem A.15 still holds verbatim even if $H$ is infinite-dimensional:


Theorem B.94. Let $H$ be a Hilbert space, suppose $a \in B(H)$ is self-adjoint, and let
$C^*(a)$ be the C*-algebra generated within $B(H)$ by $a$ and $1_H$ (that is, the intersection
of all C*-algebras containing $a$ and $1_H$). Then $C^*(a)$ is commutative, and there is a
(necessarily isometric) isomorphism of (commutative) C*-algebras

$$C(\sigma(a)) \xrightarrow{\ \cong\ } C^*(a), \quad f \mapsto f(a), \tag{B.288}$$

which is unique if it is subject to the following conditions:

• the unit function $1_{\sigma(a)}: \lambda \mapsto 1$ corresponds to the unit operator $1_H$;
• the identity function $\mathrm{id}_{\sigma(a)}: \lambda \mapsto \lambda$ is mapped to the given operator $a$.

The map $f \mapsto f(a)$ is called the continuous functional calculus. In particular,

$$(tf + g)(a) = t f(a) + g(a); \tag{B.289}$$
$$(fg)(a) = f(a)g(a); \tag{B.290}$$
$$f(a)^* = \overline{f}(a). \tag{B.291}$$


It is worth mentioning that by Theorem C.62 (cf. Appendix C) an isomorphism
of C*-algebras is automatically isometric, but in this case the equality

$$\|f(a)\| = \|f\|_\infty \tag{B.292}$$

acts as a lemma in the proof that (B.288) is an isomorphism, so we need to prove it
explicitly; cf. (B.225) for the left-hand side, and (1.24) for the right-hand side.
Note that Theorem B.94 is even true for the larger class of normal bounded operators
$a$ (which might even be defined by the property that $C^*(a)$ is commutative), but
for applications to quantum mechanics it is sufficient to deal with the self-adjoint
case (which even mathematically is not a restriction, as it implies the normal case).


Proof. We repeat (A.52) and (A.53) - (A.55), obtaining a map $f \mapsto f(a)$ defined for
polynomials $f$ on $\mathbb{R}$, restricted to $\sigma(a) \subset \mathbb{R}$. The *-algebra $P^*(a)$ of all polynomials
in $a$ is dense in $C^*(a)$ by definition of the latter, since one cannot have a smaller
C*-algebra in $B(H)$ containing $a$ and $1_H$ than the norm-closure of $P^*(a)$. In order to
take advantage of this, we need the following lemma.

Lemma B.95. For any $a \in B(H)$ and any polynomial $p$ on $\mathbb{C}$, we have

$$\sigma(p(a)) = p(\sigma(a)) = \{p(\lambda) \mid \lambda \in \sigma(a)\}; \tag{B.293}$$
$$\|a\| = \sqrt{r(a^*a)}, \tag{B.294}$$

see (B.254). In particular, if $a^* = a$, then $\|a\| = r(a)$, cf. (B.257).
This is part of Theorem B.84, but we now give a direct proof of the second part. We
first note that if $a^* = a$, then either $\|a\|$ or $-\|a\|$ (or both) are in $\sigma(a)$. To show this,
take a sequence $(\psi_n)$ of unit vectors in $H$ such that $\lim_n \|a\psi_n\| = \|a\|$. Then

$$\begin{aligned}
\|(a^2 - \|a\|^2)\psi_n\|^2 &= \langle (a^2 - \|a\|^2)\psi_n, (a^2 - \|a\|^2)\psi_n\rangle \\
&= \|a^2\psi_n\|^2 + \|a\|^4 - 2\|a\|^2\|a\psi_n\|^2 \\
&\le 2\|a\|^4 - 2\|a\|^2\|a\psi_n\|^2,
\end{aligned} \tag{B.295}$$

so that $\lim_n \|(a^2 - \|a\|^2)\psi_n\|^2 = 0$, and hence $\|a\|^2 \in \sigma(a^2)$ by Theorem B.91. But
part 1 of the lemma gives $\sigma(a^2) = \{\lambda^2 \mid \lambda \in \sigma(a)\}$, so that $\|a\| \in \sigma(a)$ or $-\|a\| \in \sigma(a)$.

The second observation is that, for general $a \in B(H)$, if some $z \in \mathbb{C}$ has $|z| > \|a\|$,
then $z \in \rho(a)$. This follows from (part 1 of) the proof of Theorem B.84. Thus we
firstly have $r(a) \ge \|a\|$ (for $a^* = a$), and secondly (for all $a$), $r(a) \le \|a\|$.
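For real $2 \times 2$ matrices both statements of Lemma B.95 about norms can be tested directly. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the matrices are arbitrary examples, and eigenvalues are obtained from the characteristic polynomial.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Eigenvalues of [[p, q], [r, s]] from the characteristic polynomial."""
    (p, q), (r, s) = m
    tr, det = p + s, p * s - q * r
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

def spectral_radius(m):
    return max(abs(z) for z in eigenvalues_2x2(m))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def operator_norm(m):
    """||a|| = sqrt(r(a* a)) as in (B.294); for real matrices a* is the transpose."""
    return math.sqrt(spectral_radius(matmul(transpose(m), m)))

sym = [[2.0, 1.0], [1.0, -1.0]]   # self-adjoint: norm equals spectral radius
nil = [[0.0, 1.0], [0.0, 0.0]]    # nilpotent, not normal: spectral radius 0, norm 1

print(operator_norm(sym), spectral_radius(sym))
print(operator_norm(nil), spectral_radius(nil))
```

The nilpotent example shows why self-adjointness (or at least normality) is needed for $\|a\| = r(a)$, whereas $r(a) \le \|a\|$ holds in both cases.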


Using Lemma B.95, we now prove that (B.292) holds for real polynomials $f = p$:

$$\begin{aligned}
\|p(a)\| = r(p(a)) &= \sup\{|\lambda|, \lambda \in \sigma(p(a))\} = \sup\{|\lambda|, \lambda \in p(\sigma(a))\} \\
&= \sup\{|p(\lambda)|, \lambda \in \sigma(a)\} = \|p\|_\infty.
\end{aligned} \tag{B.296}$$

The case of complex polynomials $p$ follows from this, since, using (B.289) - (B.291),

$$\|p(a)\|^2 = \|p(a)^*p(a)\| = \|(\overline{p}p)(a)\| = \|\overline{p}p\|_\infty = \|p\|_\infty^2. \tag{B.297}$$

Thus we have proved the isometric *-algebra isomorphism $P(\sigma(a)) \cong P^*(a)$, where
$P(\sigma(a))$ and $P^*(a)$ are the canonically normed vector spaces of all finite polynomials
in $t \in \sigma(a)$ and in $a \in B(H)$, respectively. Neither is complete (when $H$ is
infinite-dimensional and $a \ne 0$), but given isometricity, it is easy to pass to their
completions, which by Weierstrass and by definition are $C(\sigma(a))$ and $C^*(a)$,
respectively. Thus for $f \in C(\sigma(a))$ we find a sequence $(p_n)$ in $P(\sigma(a))$ such that
$p_n \to f$ (from which it follows that $(p_n)$ is Cauchy in $C(\sigma(a))$), and define

$$f(a) = \lim_{n\to\infty} p_n(a); \tag{B.298}$$

this limit exists because $\|p_n(a) - p_m(a)\| = \|p_n - p_m\|_\infty$, so that $(p_n(a))$ is Cauchy
in the Banach space $C^*(a)$. Furthermore, if $p'_n \to f$, and $f'(a) = \lim_n p'_n(a)$, then

$$\|f(a) - f'(a)\| = \lim_{n\to\infty} \|p_n(a) - p'_n(a)\| = \lim_{n\to\infty} \|p_n - p'_n\|_\infty = 0, \tag{B.299}$$

so $f'(a) = f(a)$. From (B.296) - (B.298) and continuity of the norm (i.e. $\|f(a)\| =
\lim_n \|p_n(a)\|$, which gives (B.292)), the map $f \mapsto f(a)$ is isometric and hence
injective on $C(\sigma(a))$, and the above construction trivially makes it surjective.
Finally, the properties (B.289) - (B.291) follow from (A.53) - (A.55) by continuity.
These properties also imply the uniqueness of the map $f \mapsto f(a)$ given the
conditions stated in the theorem, because these conditions and (A.53) - (A.55) define
the map on $P(\sigma(a))$ and hence, by continuity, also on $C(\sigma(a))$.
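In finite dimension the functional calculus reduces to $f(a) = \sum_\lambda f(\lambda) e_\lambda$, which makes the defining properties easy to verify by hand or by machine. The following sketch assumes the hypothetical example $a = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, whose spectrum is $\{1, 3\}$; the names `apply`, `matmul` and `close` are illustrative helpers.

```python
import math

# a = [[2, 1], [1, 2]] has spectrum {1, 3} with spectral projections E1, E3.
A = [[2.0, 1.0], [1.0, 2.0]]
E1 = [[0.5, -0.5], [-0.5, 0.5]]   # projection onto the eigenvector (1, -1)/sqrt(2)
E3 = [[0.5, 0.5], [0.5, 0.5]]     # projection onto the eigenvector (1, 1)/sqrt(2)

def apply(f):
    """f(a) = f(1) E1 + f(3) E3, the finite-dimensional functional calculus."""
    return [[f(1.0) * E1[i][j] + f(3.0) * E3[i][j] for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

f = math.sqrt
g = lambda t: t * t + 1.0

identity_ok = close(apply(lambda t: t), A)                                   # id maps to a
product_ok = close(apply(lambda t: f(t) * g(t)), matmul(apply(f), apply(g)))  # cf. (B.290)
root = apply(math.sqrt)
root_ok = close(matmul(root, root), A)                                       # sqrt(a) squared is a
print(identity_ok, product_ok, root_ok)
```

The last check is the finite-dimensional version of the square root that the text later obtains from Theorem B.94.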
For a nice reformulation of Theorem B.94 in terms of the Gelfand spectrum, cf. §C.4.
For later use (cf. Proposition B.98 below) we add a related result.


Lemma B.96. If $a \in B(H)$ is self-adjoint, then

$$\|a\| = \sup\{|\langle\psi, a\psi\rangle|, \psi \in H, \|\psi\| = 1\}. \tag{B.300}$$

In particular, if $a, b \in B(H)$ are both positive and $a \le b$, then $\|a\| \le \|b\|$.

Proof. Define the numerical range $\nu(a)$ of an arbitrary $a \in B(H)$ as

$$\nu(a) = \{\langle\psi, a\psi\rangle, \psi \in H, \|\psi\| = 1\}. \tag{B.301}$$

Clearly, if $\lambda \in \sigma_p(a)$, then $\lambda \in \nu(a)$. If $\lambda \in \sigma_c(a)$, then, in the notation of Theorem
B.91, by Cauchy-Schwarz and normalization of $\psi_n$, we have

$$|\langle\psi_n, (a-\lambda)\psi_n\rangle| \le \|(a-\lambda)\psi_n\|. \tag{B.302}$$

Hence in view of (B.279) we have

$$\lim_{n\to\infty} \langle\psi_n, a\psi_n\rangle = \lambda. \tag{B.303}$$

So $\lambda \in \nu(a)^-$, whence $\sigma(a) \subset \nu(a)^-$, and hence $r(a) \le \sup\{|\lambda|, \lambda \in \nu(a)\}$. From
Cauchy-Schwarz, in (B.301) we have $|\langle\psi, a\psi\rangle| \le \|a\|$. If also $a^* = a$, then by Lemma B.95,

$$\|a\| = r(a) \le \sup\{|\lambda|, \lambda \in \nu(a)\} \le \|a\|.$$

Hence we have equalities everywhere, and (B.300) follows.


Generalizing parts 2 and 3 of Theorem A.15 to the infinite-dimensional case
requires some motivation. To this effect, note that the continuous functional calculus
$f \mapsto f(a)$ is positive, i.e., if $f \ge 0$ pointwise, then $f(a) \ge 0$ in that $\langle\psi, f(a)\psi\rangle \ge 0$ for
each $\psi \in H$. Indeed, we have $f \ge 0$ iff $f = g^*g$ for some $g \in C(\sigma(a))$, with $g^*(x) =
\overline{g(x)}$ as usual, and hence, by (B.290) - (B.291), $f(a) = g(a)^*g(a)$ and therefore
$\langle\psi, f(a)\psi\rangle = \|g(a)\psi\|^2 \ge 0$. By Corollary B.17, if $\psi \in H$ is a unit vector, there is a
probability measure $\mu_\psi$ on $\sigma(a)$ such that for each $f \in C(\sigma(a))$,

$$\langle\psi, f(a)\psi\rangle = \int_{\sigma(a)} d\mu_\psi\, f. \tag{B.304}$$

The key to the envisaged generalization of Theorem A.15 is that the integral on the
right may actually be defined for a far larger class of functions than $C(\sigma(a))$; cf.
(B.29). This suggests that the expression $f(a)$ on the left-hand side should similarly
be generalized to a larger class of functions $f$. However, the $L^p$ spaces considered
in §B.6 are defined on the basis of some measure $\mu$; since $\mu_\psi$ in (B.304) varies with
$\psi$ and $f(a)$ should be independent of $\psi$, it is appropriate to use the space $\mathcal{B}(\sigma(a))$
of bounded functions $f: \sigma(a) \to \mathbb{C}$ that are measurable with respect to the Borel
$\sigma$-algebra on $\sigma(a)$ (which consists of the Borel sets on $\mathbb{R}$ intersected with $\sigma(a)$).
Since both boundedness and measurability are preserved under uniform limits
(measurability even being preserved under pointwise limits), $\mathcal{B}(\sigma(a))$ is complete in the
sup-norm, which makes it a commutative C*-algebra (under pointwise operations).
Among all functions in $\mathcal{B}(\sigma(a))$, we will be particularly interested in the
characteristic functions $1_\Delta$, where $\Delta \subset \sigma(a)$ is measurable. The expressions

$$e_\Delta = 1_\Delta(a) \quad (\Delta \subset \sigma(a)); \tag{B.305}$$
$$e_\Delta = e_{\Delta \cap \sigma(a)} \quad (\Delta \subset \mathbb{R}); \tag{B.306}$$
$$e_\lambda = 1_{\{\lambda\}}(a) \quad (\lambda \in \sigma_p(a)), \tag{B.307}$$

to be defined below, where $\Delta$ is a Borel set (and $e_\emptyset = 0$ by convention), are the
spectral projections of $a$ (which are of fundamental importance to quantum mechanics).


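Lemma B.97. Any positive function $f \in \mathcal{B}(\sigma(a))$ is a pointwise limit of some
monotone increasing bounded sequence $(f_n)$ in $C(\sigma(a))$, written $f_n \nearrow f$. That is,

$$0 \le f_1(x) \le \cdots \le f_n(x) \le f_{n+1}(x) \le \cdots \le c \cdot 1_{\sigma(a)}(x); \tag{B.308}$$
$$f(x) = \lim_{n\to\infty} f_n(x), \quad x \in \sigma(a). \tag{B.309}$$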


Proof. We start with $f = 1_K$, where $K \subset \sigma(a)$ is compact. Then $K = \bigcap_n U_n$ for
certain open sets $U_n$ (this is true for any second countable space), and taking “Urysohn”
functions $f_n$ for each $U_n$ (i.e., $f_n \in C_c(U_n)$, $0 \le f_n(x) \le 1$ for $x \in \sigma(a)$, and $f_n(x) = 1$
for $x \in K$), we obviously have $f_n \to 1_K$. Next, if $U \subset \sigma(a)$ is open, we have
$U = \bigcup_n K_n$ for suitable compact $K_n$ (since $\mathbb{R}$ and hence $\sigma(a)$ is $\sigma$-compact), so
$1_{K_n} \to 1_U$. This also gives $1_C$ for closed sets $C = \sigma(a)\setminus U$, since $1_C = 1_{\sigma(a)} - 1_U$.
Using the so-called Borel hierarchy, it can be shown that any Borel set $\Delta \subset \sigma(a)$ can
be constructed from open and closed sets in at most a countable number of steps,
at each of which a countable union or intersection of sets from the previous steps
is used. This gives $1_\Delta$ for any Borel set, and hence also yields the simple functions
$s = \sum_k c_k 1_{\Delta_k}$, with $c_k \ge 0$. For arbitrary measurable $f \ge 0$ (not necessarily bounded
and not even necessarily finite) it is a standard result in measure theory that there is
a sequence $(s_n)$ of simple functions such that $s_n \nearrow f$: to wit, define


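$$\Delta_{n,k} = \{x \in \sigma(a) \mid 2^{-n}k \le f(x) < 2^{-n}(k+1)\}; \tag{B.310}$$
$$\Delta_n = \{x \in \sigma(a) \mid n \le f(x) \le \infty\}; \tag{B.311}$$
$$s_n = n \cdot 1_{\Delta_n} + 2^{-n} \sum_{k=1}^{2^n n - 1} k\, 1_{\Delta_{n,k}}. \tag{B.312}$$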
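The dyadic construction (B.310) - (B.312) is easy to evaluate pointwise. The sketch below is a minimal illustration, assuming a hypothetical test function $f(x) = x^2$ and a handful of sample points; `dyadic_approx` is an illustrative name.

```python
import math

def dyadic_approx(f, n):
    """Return the simple function s_n of (B.310)-(B.312) built from f >= 0."""
    def s_n(x):
        y = f(x)
        if y >= n:                      # x lies in Delta_n
            return float(n)
        k = math.floor(y * 2 ** n)      # x lies in Delta_{n,k}
        return k / 2 ** n
    return s_n

f = lambda x: x * x                     # hypothetical positive function
points = [i / 7 for i in range(22)]     # sample points in [0, 3]

for n in (1, 2, 4, 8):
    s = dyadic_approx(f, n)
    print(n, max(f(x) - s(x) for x in points))
```

At each point the values $s_n(x)$ increase with $n$ and satisfy $0 \le f(x) - s_n(x) < 2^{-n}$ as soon as $n > f(x)$, which is the monotone pointwise convergence $s_n \nearrow f$ used in the proof.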


Relabeling the (at most) countable number of sequences thus obtained as a single
sequence then gives a positive sequence $(h_n)$ in $C(\sigma(a))$ such that $h_n \to f$ pointwise.

A final trick turns $(h_n)$ into a monotone increasing bounded sequence $(f_n)$: for
$m \ge n$, define $f_{nm} = \min\{h_n, \ldots, h_m\}$, which is monotone decreasing in $m$ and
positive, and hence has a (pointwise) limit $f_n = \lim_{m\to\infty} f_{nm}$. The ensuing sequence $(f_n)$
is monotone increasing and still converges to $f$. If $f$ is bounded (as we assume by
definition of $\mathcal{B}(\sigma(a))$), then $(f_n)$ must also be bounded eventually.
If $f \in \mathcal{B}(\sigma(a))$ and $f_n \nearrow f$ with $f_n \in C(\sigma(a))$, we would like to define $f(a)$ as
$\lim_n f_n(a)$, just as in the case where $f \in C(\sigma(a))$ and $f_n \in P(\sigma(a))$. However, in
the former case convergence $f_n \to f$ is merely pointwise, whereas in the latter case
it was uniform, translated into norm convergence $f_n(a) \to f(a)$. Pointwise convergence
of functions, then, becomes strong convergence of operators:


Proposition B.98. If $(a_n)$ is a sequence of positive operators on $H$ for which

$$0 \le a_1 \le \cdots \le a_n \le a_{n+1} \le \cdots \le c \cdot 1_H, \tag{B.313}$$

where $a_i \le a_j$ means that $\langle\psi, a_i\psi\rangle \le \langle\psi, a_j\psi\rangle$ for each $\psi \in H$, then there exists a
unique positive operator $a$ such that $a_n \nearrow a$ strongly, i.e. for each $\psi \in H$,

$$a\psi = \lim_{n\to\infty} a_n\psi. \tag{B.314}$$

Furthermore, $a = \sup_n a_n$ with respect to the partial ordering $\le$ on the set of positive
bounded operators (that is, $a_n \le a$ for each $n$, and if $a_n \le b$ for each $n$, then $a \le b$).


Proof. Recalling Proposition A.4, define a sequence of bounded quadratic forms
$Q_n: H \to \mathbb{R}$ by $Q_n(\psi) = \langle\psi, a_n\psi\rangle$. Then $(Q_n(\psi))$ is a monotone increasing bounded
sequence for each $\psi \in H$, so that $Q(\psi) = \lim_n Q_n(\psi)$ exists. Like each $Q_n$, also
$Q$ satisfies (A.8) - (A.9). Since $|Q_n(\psi)| \le c\|\psi\|^2$ and hence $|Q(\psi)| \le c\|\psi\|^2$, it
remains bounded. Hence (A.10) defines a bounded hermitian form $B$, upon which
Proposition B.79 yields a bounded operator $a$, satisfying $B(\varphi, \psi) = \langle\varphi, a\psi\rangle$. Since

$$\langle\psi, a\psi\rangle = \lim_{n\to\infty} \langle\psi, a_n\psi\rangle, \tag{B.315}$$

we have $a \ge 0$. To prove (B.314), note that (B.315) gives $\langle\psi, (a-a_n)\psi\rangle \to 0$, but
(B.313) implies $a - a_n \ge 0$, so that $a - a_n$ has a self-adjoint square root $\sqrt{a - a_n}$,
defined by Theorem B.94 (see also Proposition B.99 below). Hence

$$\langle\psi, (a-a_n)\psi\rangle = \langle\sqrt{a-a_n}\,\psi, \sqrt{a-a_n}\,\psi\rangle = \|\sqrt{a-a_n}\,\psi\|^2 \to 0. \tag{B.316}$$

Now if a sequence of operators $(b_n)$ is such that $\|b_n\| \le C$ for all $n$, and $\|b_n\psi\| \to 0$,
then also $\|b_n^2\psi\| \to 0$, for $\|b_n^2\psi\| \le \|b_n\|\,\|b_n\psi\| \le C\|b_n\psi\| \to 0$. This applies here
to $b_n = \sqrt{a - a_n}$, whose norm is $\|a - a_n\|^{1/2}$: since $a_m \le a_n$ for $m \le n$, we have
$a - a_n \le a - a_m$, from which $\|a - a_n\| \le \|a - a_m\|$ (see Lemma B.96). Fixing $m$, this
gives $\|a - a_n\| \le \|a - a_m\|$, and hence a uniform bound on $\|b_n\|$, for all
$n > m$. So (B.316) implies $\|(a - a_n)\psi\| \to 0$, which is (B.314).

As to the final claim, eq. (B.315) is the same as $\langle\psi, a\psi\rangle = \sup_n\{\langle\psi, a_n\psi\rangle\}$.


In this proof, we used the following generalization of Proposition A.22:

Proposition B.99. The following conditions on $a \in B(H)$ are equivalent:

1. $\langle\psi, a\psi\rangle \ge 0$ for arbitrary $\psi \in H$;
2. $a^* = a$ and $\sigma(a) \subset \mathbb{R}^+$;
3. $a = c^2$ for some bounded self-adjoint operator $c \in B(H)$;
4. $a = b^*b$ for some bounded operator $b \in B(H)$.
Proof. The proof is the same as in the finite-dimensional case, except that:

• In 1 → 2 we use (B.303) to exclude the possibility that some $\lambda < 0$ lies in $\sigma(a)$;
• In 2 → 3 we need Theorem B.94 to define the square root $c = \sqrt{a}$ from the
function $\sqrt{\cdot}: \sigma(a) \to \mathbb{R}^+$ (which is well defined because $\sigma(a) \subset \mathbb{R}^+$). By (B.290)
with $g = f = \sqrt{\cdot}$, we then have $\sqrt{a}\sqrt{a} = a$.


Given some positive $f \in \mathcal{B}(\sigma(a))$, we now use Lemma B.97 to find a monotone
increasing bounded sequence $(f_n)$ in $C(\sigma(a))$ such that $f_n \nearrow f$ pointwise, and
subsequently use Proposition B.98 to define $f(a)$ as the strong limit

$$f(a)\psi = \lim_{n\to\infty} f_n(a)\psi \quad (\psi \in H). \tag{B.317}$$

Arbitrary functions $f$ are then dealt with using (B.30) and performing the above
construction term-wise. This, then, yields $f(a)$ for any $a^* = a \in B(H)$ and $f \in \mathcal{B}(\sigma(a))$.

It is natural to ask which corner of $B(H)$ the operators $f(a)$ land in when $f \in
\mathcal{B}(\sigma(a))$, much as we have shown that $f(a) \in C^*(a)$ for $f \in C(\sigma(a))$. A safe choice
would be $C^*(a)^-$, i.e., the strong closure of $C^*(a)$, which by definition contains all
limits of all strongly convergent nets in $C^*(a)$ (so that it certainly contains all limits
(B.317)), and which is automatically a strongly closed unital *-algebra. This may
seem too large, but if $H$ is separable, it turns out to be the right choice, because
these more general limits add nothing to (B.317). For a more explicit description
of $C^*(a)^-$ we need the commutant $S'$ of any $S \subset B(H)$, which is defined by

$$S' = \{a \in B(H) \mid ab = ba \ \forall b \in S\}; \tag{B.318}$$

the bicommutant of $S$ is $S'' = (S')'$. If $S^* = S$, in that $a \in S$ iff $a^* \in S$, then $S'$ is easily
seen to be a unital *-algebra within $B(H)$. Furthermore, it is obvious that $S \subset S''$, so
that the passage $S \mapsto S''$ is some sort of a closure operation within $B(H)$, comparable
to the closure operation $S \mapsto S^{\perp\perp}$ within $H$ itself. Indeed, there is a striking analogue
of (B.204) at the operator level, due to von Neumann (see Theorem C.127):


Theorem B.100. If $A$ is a unital *-algebra in $B(H)$, then

$$A'' = A^-, \tag{B.319}$$

where $A^-$ is the strong closure of $A$ in $B(H)$ (which is automatically a *-algebra).

Corollary B.101. Denoting the strong closure $C^*(a)^-$ of $C^*(a)$ by $W^*(a)$, we have

$$W^*(a) = C^*(a)''. \tag{B.320}$$


Though not obvious from (B.320), the alternative description through (B.319) shows
that $W^*(a)$ inherits the commutativity of $C^*(a)$; in fact $W^*(a)$ is a commutative
C*-algebra, too. Moreover, by construction it is also a von Neumann algebra in that
$W^*(a)'' = W^*(a)$, cf. Appendix C. Such unital *-algebras in $B(H)$ are not merely
norm closed, but are also closed in at least three other natural topologies on $B(H)$,
including the strong one. The situation may be summarized in the spectral theorem:
Theorem B.102. Let $a^* = a \in B(H)$. The isomorphism $C(\sigma(a)) \to C^*(a)$ of Theorem
B.94 has a unique extension to a homomorphism

$$\mathcal{B}(\sigma(a)) \to W^*(a), \quad f \mapsto f(a), \tag{B.321}$$

for which (B.289) - (B.291) continue to hold. In particular, the operator $e_\Delta$ in (B.305) is a
projection. Also, eq. (B.304) remains valid, and for each $f \in \mathcal{B}(\sigma(a))$, one has

$$\|f(a)\| \le \|f\|_\infty. \tag{B.322}$$


Proof. The map $f \mapsto f(a)$ is given by (B.317) and preceding discussion. Eqs.
(B.289) and (B.291) easily follow by limiting arguments. Using the same trick
as in the proof of Proposition B.98 it can be shown that $f(a)^2 = f^2(a)$, whence,
using the identity $fg = \frac{1}{2}((f+g)^2 - f^2 - g^2)$, eq. (B.290) follows. This implies
$e_\Delta^2 = 1_\Delta^2(a) = 1_\Delta(a) = e_\Delta$, whilst (B.291) gives $e_\Delta^* = \overline{1_\Delta}(a) = 1_\Delta(a) = e_\Delta$.

We prove (B.322) for $f \ge 0$; this implies the general case by (B.30) and the
triangle inequality. Writing $H_1$ for the set of unit vectors in $H$, approximating $f_n \nearrow f$,
repeatedly using (B.300), the property $f(a) = \sup_n f_n(a)$ established at the end of
Proposition B.98, and finally using (B.292) for each $f_n \in C(\sigma(a))$, we may estimate:

$$\begin{aligned}
\|f(a)\| &= \sup_{\psi \in H_1} \{|\langle\psi, f(a)\psi\rangle|\} \\
&= \sup_{\psi \in H_1} \sup_{n \in \mathbb{N}} \{|\langle\psi, f_n(a)\psi\rangle|\} \\
&= \sup_{n \in \mathbb{N}} \sup_{\psi \in H_1} \{|\langle\psi, f_n(a)\psi\rangle|\} \\
&= \sup_{n \in \mathbb{N}} \|f_n(a)\| = \sup_{n \in \mathbb{N}} \|f_n\|_\infty \\
&\le \|f\|_\infty,
\end{aligned} \tag{B.323}$$

where the last inequality is a trivial consequence of the specific limit $f_n \nearrow f$.

Finally, our motivating identity (B.304) follows from the same equality for each
$f_n \in C(\sigma(a))$, upon which Lebesgue’s Monotone Convergence Theorem yields the
right-hand side, whereas (B.315) gives the left-hand side.


Of course, in finite dimension, Theorem B.102 coincides with Theorems A.15
and B.94. Theorem A.15 implies Theorem A.10 through (A.58) - (A.59),
and, as we will now explain, in infinite dimension Theorem B.102 similarly implies
a certain approximate version of Theorem A.10, namely Corollary B.104.


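Lemma B.103. If $K \subset \mathbb{R}$ is compact, any $f \in C(K)$ may be uniformly approximated
by simple functions. More precisely, for each $\varepsilon > 0$ there is a decomposition $K =
\bigsqcup_{i=1}^n \Delta_i$ of $K$ as a disjoint union of $n < \infty$ Borel sets $\Delta_i$, such that for any $x_i \in \Delta_i$,

$$\Big\| f - \sum_{i=1}^n f(x_i)\, 1_{\Delta_i} \Big\|_\infty < \varepsilon. \tag{B.324}$$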
Proof. Since $K$ is compact, $f$ is actually uniformly continuous on $K$. This means
that for $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ whenever $|x - y| < \delta$. Since
(B.324) just states that $|f(x) - f(x_i)| < \varepsilon$ for each $i = 1, \ldots, n$ and each $x \in \Delta_i$, any
partition for which $0 < |\Delta_i| < \delta$ will do (where $|\Delta| = \sup\{|x - y|, x, y \in \Delta\}$).


From (B.305), Lemma B.103, and Theorem B.102, we then immediately have: 


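Corollary B.104. Let $a^* = a \in B(H)$. For any $f \in C(\sigma(a))$ and any $\varepsilon > 0$, there is
a partition $\sigma(a) = \bigsqcup_{i=1}^n \Delta_i$ of $\sigma(a)$ as a disjoint union of $n < \infty$ Borel sets $\Delta_i$, such
that for arbitrary $\lambda_i \in \Delta_i$, one has

$$\Big\| f(a) - \sum_{i=1}^n f(\lambda_i)\, e_{\Delta_i} \Big\| < \varepsilon. \tag{B.325}$$

In particular, for $f(x) = x$ and $f(x) = 1$ we have

$$\Big\| a - \sum_{i=1}^n \lambda_i\, e_{\Delta_i} \Big\| < \varepsilon; \tag{B.326}$$
$$\sum_{i=1}^n e_{\Delta_i} = 1_H. \tag{B.327}$$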
i=l 


If $a$ has discrete spectrum ($\sigma(a) = \sigma_d(a)$), then (B.326) - (B.327) reduce to (A.37) -
(A.38), where $e_\lambda$ is defined by (B.307), and the sums converge in norm.


Hence in this version of the spectral theorem, one approximates $a$ by linear combinations
of projections in a way that reflects the approximation of the identity function
$x \mapsto x$ on $\sigma(a)$ by simple functions. Eq. (B.326) is often symbolically written as

$$a = \int_{\sigma(a)} de(\lambda)\, \lambda, \tag{B.328}$$

which may also be given some direct meaning as an operator-valued Stieltjes integral,
but even so, this neat expression eventually boils down to (B.326) itself.
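For the position operator on $L^2(0,1)$, where $f(a) = m_f$ and hence $e_\Delta = m_{1_\Delta}$, the estimate (B.326) becomes a statement about step functions approximating $x \mapsto x$. The sketch below is only an illustration under stated assumptions: it takes $n$ cells of equal length with midpoints as the $\lambda_i$, and estimates the supremum on a finite grid (so it gives a lower estimate of the true essential supremum $1/(2n)$).

```python
def step_approximation(n):
    """Simple function sum_i lambda_i 1_{Delta_i}, with Delta_i = [i/n, (i+1)/n) and lambda_i its midpoint."""
    def s(x):
        i = min(int(x * n), n - 1)      # index of the cell containing x; x = 1 is put in the last cell
        return (i + 0.5) / n
    return s

def sup_error(n, samples=10001):
    """Grid estimate of sup |x - s(x)| over [0, 1]."""
    s = step_approximation(n)
    return max(abs(x - s(x)) for x in (k / (samples - 1) for k in range(samples)))

for n in (1, 4, 16, 64):
    print(n, sup_error(n), 1 / (2 * n))
```

Refining the partition thus makes the operator-norm error in (B.326) as small as desired, which is the content of the symbolic formula (B.328).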


Corollary B.105. Let $\mathcal{P}(A) = \{e \in A \mid e^2 = e^* = e\}$, where $A$ is a von Neumann
algebra. Then $A$ is the norm-closure of the linear span of $\mathcal{P}(A)$, and

$$A = \mathcal{P}(A)''. \tag{B.329}$$


Proof. The first claim follows from Corollary B.104. This implies (B.329), which
may also be proved directly: since $\mathcal{P}(A) \subset A$, the inclusion $\mathcal{P}(A)'' \subset A'' = A$ is
obvious. Conversely, let $a \in A$ and assume $a^* = a$ (if not, decompose $a = a' + ia''$ with
$a'$ and $a''$ self-adjoint). Then $W^*(a) \subset A$, so that $A$ contains all spectral projections
of $a$, cf. Theorem B.102. Moreover, by Corollary B.104, $a$ lies in the norm-closure
of the linear span of $\mathcal{P}(A)$, which by Theorem B.100 in turn is contained in $\mathcal{P}(A)''$.
B.16 Abelian *-algebras in B(H)


Compared with Theorem B.94, it seems a weakness of Theorem B.102 that the map
$f \mapsto f(a)$ fails to be an isomorphism from $\mathcal{B}(\sigma(a))$ to $W^*(a)$. The reason is that
although the map is surjective (at least when $H$ is separable), it fails to be injective:
for real-valued $f$ one has $f(a) = 0$ iff $\langle\psi, f(a)\psi\rangle = 0$ for all $\psi \in H$, which by
(B.304) is the case iff $\int_{\sigma(a)} d\mu_\psi\, f = 0$ for all unit vectors $\psi \in H$, which in turn is
the case iff $f = 0$ a.e. with respect to $\mu_\psi$, in other words, iff $f = 0$ in $L^\infty(\sigma(a), \mu_\psi)$.

Thus the right kind of algebra to be isomorphic to $W^*(a)$ is $L^\infty(\sigma(a), \mu)$ rather
than $\mathcal{B}(\sigma(a))$, where $\mu$ is some (probability) measure on $\sigma(a)$ such that $\mu(\Delta) = 0$
iff $\mu_\psi(\Delta) = 0$ for all unit vectors $\psi \in H$. Indeed, in that case, since by construction

$$L^\infty(\sigma(a), \mu) = \mathcal{B}(\sigma(a))/\{f \mid f = 0 \ \mu\text{-a.e.}\} = \mathcal{B}(\sigma(a))/\ker(f \mapsto f(a)), \tag{B.330}$$

our map $\mathcal{B}(\sigma(a)) \to W^*(a)$ descends to an isomorphism of von Neumann algebras:

$$L^\infty(\sigma(a), \mu) \xrightarrow{\ \cong\ } W^*(a). \tag{B.331}$$

This is quite nontrivial; let us first present a case study where everything is clear.


Proposition B.106. Let $H = L^2(0,1) \equiv L^2([0,1])$ (with Lebesgue measure), and let
$a = m_{\mathrm{id}} \in B(H)$ (where $\mathrm{id}(x) = x$) be the self-adjoint position operator

$$a\psi(x) = x\psi(x). \tag{B.332}$$

Then the map $f \mapsto f(a)$ in both Theorems B.94 and B.102 is given by

$$f(a) = m_f, \tag{B.333}$$

cf. Proposition B.73. The two *-algebras in $B(H)$ defined by $a$ are given by

$$C^*(a) = C([0,1]); \tag{B.334}$$
$$W^*(a) = L^\infty(0,1), \tag{B.335}$$

both realized as multiplication operators (i.e., identifying $f$ with $m_f$). Furthermore,

$$L^\infty(0,1)' = L^\infty(0,1). \tag{B.336}$$

More generally, let $K \subset \mathbb{R}$ be compact, let $\mu$ be a regular probability measure on $K$
with support $K$, take $H = L^2(K, \mu)$ and define $a$ as in (B.332). Then:

$$\sigma(a) = K; \tag{B.337}$$
$$C^*(a) = C(K); \tag{B.338}$$
$$W^*(a) = L^\infty(K, \mu); \tag{B.339}$$
$$f(a) = m_f; \tag{B.340}$$
$$L^\infty(K, \mu)' = L^\infty(K, \mu). \tag{B.341}$$
Proof. We just prove the case $K = [0,1]$ with $d\mu(x) = dx$; the general case is similar.

Eq. (B.333) is obvious for polynomials $f$, and otherwise follows from easy limiting
arguments. Consequently, eq. (B.334) is an instance of Theorem B.94. Everything
else then follows if we can prove that

$$C([0,1])' = L^\infty(0,1). \tag{B.342}$$

Namely, assuming (B.342), since $C([0,1]) \subset L^\infty(0,1)$ (and $A \subset B$ implies $B' \subset
A'$), we automatically have $L^\infty(0,1)' \subset C([0,1])'$, so (B.342) implies $L^\infty(0,1)' \subset
L^\infty(0,1)$, and since the converse inclusion is trivial from commutativity of $L^\infty(0,1)$,
eq. (B.342) implies (B.336). Furthermore, since $W^*(a) = C([0,1])''$, taking the
commutant of (B.342) and applying (B.336) yields (B.335).

So let us prove (B.342). The inclusion $L^\infty(0,1) \subset C([0,1])'$ is obvious, since
$m_f m_g = m_{fg} = m_{gf} = m_g m_f$, so we need to prove the converse. Take $b \in C([0,1])'$
and define $f = b1_{[0,1]} \in L^2(0,1)$. For $\psi \in C([0,1]) \subset L^2(0,1)$, we have

$$b\psi = b m_\psi 1_{[0,1]} = m_\psi b 1_{[0,1]} = m_\psi f = m_\psi m_f 1_{[0,1]} = m_f m_\psi 1_{[0,1]} = m_f\psi, \tag{B.343}$$

so $b = m_f$ on the dense domain $C([0,1]) \subset L^2(0,1)$, with $f \in L^2(0,1)$. Now $b$ is
bounded by definition of the commutant $C([0,1])'$ and hence $\|m_f\| < \infty$. If $f \notin
L^\infty(0,1)$, the proof of Proposition B.73 gives that $X_t$ has positive measure for each
$t > 0$, whence $\|m_f\| \ge t$ for all $t$, which is a contradiction. Hence $f \in L^\infty(0,1)$, in
which case $m_f$ extends to all of $L^2(0,1)$ by continuity. This extension must equal $b$,
so that $b = m_f$, and hence $C([0,1])' \subset L^\infty(0,1)$.


The following variation on this example turns out to be qualitatively different: 


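Proposition B.107. Realizing $\ell^\infty(\mathbb{N})$ as multiplication operators on $\ell^2(\mathbb{N})$, one has

$$\ell^\infty(\mathbb{N})' = \ell^\infty(\mathbb{N}). \tag{B.344}$$

Proof. For each $N \in \mathbb{N}$, we define a finite-dimensional subspace $\ell^2_N \subset \ell^2(\mathbb{N})$ by

$$\ell^2_N = \{\psi \in \ell^2(\mathbb{N}) \mid \psi(x) = 0 \ \forall x > N\},$$

with ensuing projection $1_N: \ell^2(\mathbb{N}) \to \ell^2_N$, i.e., $1_N\psi(x) = \psi(x)$ for $x \le N$ and
$1_N\psi(x) = 0$ for $x > N$. If $b \in \ell^\infty(\mathbb{N})'$, we have $b: \ell^2_N \to \ell^2_N$, because $1_N \in
\ell^\infty(\mathbb{N})$ (and hence $\psi \in \ell^2_N$, i.e., $1_N\psi = \psi$, implies $b\psi \in \ell^2_N$, i.e., $1_N b\psi = b\psi$).
With $f_N: \mathbb{N} \to \mathbb{C}$ given by $f_N = b1_N$, define $f: \mathbb{N} \to \mathbb{C}$ by $f(x) = f_N(x)$ for any
$N \ge x$; this is well defined, in that if $x \le N \le M$, then $f_N(x) = f_M(x)$. For any $N$
and $\psi \in \ell^2_N$, as in (B.343) we have $b\psi = m_f\psi$, which therefore holds on a dense
subspace $\bigcup_N \ell^2_N$ of $\ell^2(\mathbb{N})$. Again as in the previous proof, this gives

$$\|f\|_\infty = \|m_f\| = \|b\| < \infty, \tag{B.345}$$

i.e., $f \in \ell^\infty(\mathbb{N})$. Thus $b = m_f \equiv f \in \ell^\infty(\mathbb{N})$, whence $\ell^\infty(\mathbb{N})' \subset \ell^\infty(\mathbb{N})$. With the trivial
opposite inclusion, this gives (B.344).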
Note that since a possible (discrete) position operator (B.332) would be unbounded
on $\ell^2(\mathbb{N})$, a possible counterpart to (B.335), although it exists, would blast the
framework of this section (cf. §B.21). See, however, the proof of Theorem B.118.
More generally, we have:


Proposition B.108. Let $(X, \Sigma, \mu)$ be a $\sigma$-finite Borel space and realize $L^\infty(X, \mu)$ as
multiplication operators on $L^2(X, \mu)$. Then

$$L^\infty(X, \mu)' = L^\infty(X, \mu). \tag{B.346}$$

Proof. Writing $X = \bigcup_{N \in \mathbb{N}} X_N$ with $\mu(X_N) < \infty$, which holds by virtue of $\sigma$-finiteness,
the proof is practically the same as for $X = \mathbb{N}$ (except for the fact that $L^2(X_N) \subset
L^2(X)$ need not be finite-dimensional, but it is closed, which suffices).


If $A \subset B(H)$ is a commutative *-algebra, we say that $A$ is maximal (abelian) if
$A \subseteq B \subseteq B(H)$ for some commutative *-algebra $B$ implies $B = A$. Any *-algebra $A \subset
B(H)$ is abelian iff $A \subseteq A'$ (this is trivial), and is maximally abelian iff $A' = A$. To see
the nontrivial “$\Leftarrow$” direction, for any subsets $C \subset B(H)$ and $D \subset B(H)$ the inclusion
$C \subseteq D$ implies $D' \subseteq C'$ (as is immediate from the definition of the commutant), so
$B' \subseteq A'$. Since $B$ is commutative, we also know that $B \subseteq B'$, whence $B \subseteq A'$. If $A' = A$
this gives $B \subseteq A$, so $B = A$. The condition $A' = A$, in turn, implies $A'' = A$, i.e., any
maximal abelian *-algebra $A$ in $B(H)$ is automatically a von Neumann algebra.


Corollary B.109. In the setting of Proposition B.108, $L^\infty(X, \mu)$ is a maximal abelian
*-algebra in $B(L^2(X, \mu))$, and hence a von Neumann algebra. In particular:

• $L^\infty(0,1)$ is a maximal abelian *-algebra in $B(L^2(0,1))$;
• $\ell^\infty(\mathbb{N})$ is a maximal abelian *-algebra in $B(\ell^2(\mathbb{N}))$.


The above examples suggest a neat reformulation of the spectral theorem. This 
requires a few more concepts from the theory of operator algebras, cf. Appendix C. 


Definition B.110. For any *-algebra $A \subset B(H)$ and $\psi \in H$, we write $A\psi^- \subset H$ for
the closure of the linear subspace of all vectors $a\psi$, $a \in A$. We say that $\psi$ ($\ne 0$) is:

• cyclic for $A$ if $A\psi^- = H$;
• separating for $A$ if $a\psi = 0$ for $a \in A$ implies $a = 0$.

If $a^* = a \in B(H)$, we similarly say that $\psi$ is cyclic (separating) for $a$ if $\psi$ is cyclic
(separating) for $A = C^*(a)$, or equivalently, for $A = W^*(a)$.


The equivalence of the two ways of writing the last definition follows from the
relation $W^*(a)\psi^- = C^*(a)\psi^-$, cf. Corollary B.101; more generally, $\psi$ is cyclic
(separating) for $A$ iff it is cyclic (separating) for its strong closure $A^-$.

For example, if $A = B(H)$, any vector is cyclic for $A$, and none is separating.
On the other hand, if $A = \mathbb{C} \cdot 1_H$, then no vector is cyclic for $A$ and all vectors are
separating. If $H = L^2(X, \mu)$ on some finite measure space, then $\psi = 1_X$ is cyclic
as well as separating for $A = L^\infty(X, \mu)$. Noting (B.346), as well as the property
$B(H)' = \mathbb{C} \cdot 1_H$, these examples illustrate a general phenomenon:
Lemma B.111. If $1_H \in A$, a vector $\psi$ is cyclic for $A$ iff it is separating for $A'$, and
vice versa. In particular, if $A' = A$, then $\psi$ is cyclic for $A$ iff it is separating for $A$.
If $A$ is abelian, then every vector that is cyclic for $A$ is also separating for $A$.

Proof. If $A\psi^- = H$ and $b\psi = 0$ for $b \in A'$, then $ba\psi = ab\psi = 0$ for each $a \in A$ and hence
$b$ vanishes on a dense subspace of $H$. Since $b$ is bounded, $b = 0$. Conversely, let $e$ be
the projection onto $A\psi^-$; then $e \in A'$ and hence $1_H - e \in A'$. Since $1_H \in A$ we have
$\psi \in A\psi^-$ and hence $e\psi = \psi$, whence $(1_H - e)\psi = 0$. If $\psi$ is separating for $A'$, this
implies $e = 1_H$ and hence $A\psi^- = H$. Finally, $A$ is abelian iff $A \subseteq A'$.


Theorem B.112. Let $a^* = a \in B(H)$, and suppose some unit vector $\psi \in H$ is
cyclic for $a$. Then $a$ is unitarily equivalent to the position operator (B.332) on
$L^2(\sigma(a), \mu_\psi)$, where the probability measure $\mu_\psi$ on $\sigma(a)$ is given by (B.304).
Furthermore, through the unitary operator $u: H \to L^2(\sigma(a), \mu_\psi)$ in question we have

$$u f(a) u^{-1} = m_f; \tag{B.347}$$
$$u C^*(a) u^{-1} = C(\sigma(a)); \tag{B.348}$$
$$u W^*(a) u^{-1} = L^\infty(\sigma(a), \mu_\psi), \tag{B.349}$$

all of which being realized as multiplication operators on $L^2(\sigma(a), \mu_\psi)$.
Moreover, $L^\infty(\sigma(a), \mu_\psi)$ is maximally abelian, and hence satisfies

$$L^\infty(\sigma(a), \mu_\psi) = L^\infty(\sigma(a), \mu_\psi)'. \tag{B.350}$$
Proof. First, define $u$ on a dense subspace of $H$ by

$$u: C^*(a)\psi \to L^2(\sigma(a), \mu_\psi); \tag{B.351}$$
$$u f(a)\psi = f, \quad f \in C(\sigma(a)). \tag{B.352}$$

It follows from (B.289) - (B.291) and (B.304) that $\|f(a)\psi\|_H = \|f\|_2$, which makes
$u$ well defined (since $f(a)\psi = g(a)\psi$ implies $f = g$ in $L^2(\sigma(a), \mu_\psi)$), as well as isometric.
In particular, $u$ is bounded, and hence it can be extended from $C^*(a)\psi$ to $H$ by continuity.
This extension is surjective, since $C(\sigma(a))$ is dense in $L^2(\sigma(a), \mu_\psi)$, and
therefore $u: H \to L^2(\sigma(a), \mu_\psi)$ is unitary. Then (B.347) - (B.348) hold by
construction; the special case $f = \mathrm{id}$ yields (B.332). As in Proposition B.106, we obtain
$C(\sigma(a))' = L^\infty(\sigma(a), \mu_\psi)$, which implies (B.349) - (B.350).


Note that this theorem implies that $H$ is separable. When does a self-adjoint
(or normal) operator $a$ have a cyclic vector? To practice, we first look at $H = \mathbb{C}^n$.


Proposition B.113. Let $H = \mathbb{C}^n$ and let $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ be a diagonal matrix.
Then the following properties are equivalent:

1. All $\lambda_i$ are distinct, i.e., $|\sigma(a)| = n$ (in words, $a$ is non-degenerate);
2. The operator $a$ has a cyclic vector;
3. $C^*(a)' = C^*(a)$;
4. $C^*(a)$ is a maximal abelian C*-subalgebra of $B(H)$.
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Proof. We first show that all A; are distinct iff 
C*(a) =D,(C), (B.353) 


i.e., the set of all diagonal matrices. To see this, first note that for any f : o(a) > C 
(and any such function is continuous, since o(a) is a finite subset of C) we have 


f(diag(A1,...,An)) = diag(f(A1),--.,fAn)): (B.354) 


this is true by computation for polynomials in a, and these exhaust all functions on 
o(a). It follows that C*(a) C D,(C). We know from (A.49) that C*(a) = C(o(a)) 
whether or not o(a) is non-degenerate, and since dim(C(o(a))) = |o(a)| (i.e., the 
number of elements of o(a)), we obtain 


dim(C*(a)) = |o(a)|. (B.355) 


So if a is non-degenerate, noting that dim(D,(C)) =n we must have (B.353). 
If, on the other hand, a is degenerate, we have |o(a)| = m <n, so that also 
dim(C(o(a))) =m <n and C*(a) C D,(C) is a strict inclusion. Furthermore, by 
direct computation or as a special case of Proposition B.108, we have 


D(C) =D,(C). (B.356) 
To prove | — 2, take the cyclic vector to be 


w= (1,...,1)/vA; (B.357) 


indeed, any vector (z1,...,Z,) is equal to ,/n- diag(z),...,Z,)W, and we have 
diag(z1,---,Zn) € D,(C) = C* (a) by (B.353). For 2 > 1, if H has a cyclic vector y 
for a, then by definition C*(a)w = C", so that dim(C* (a) y) =n. But also 


dim(C*(a)w) < dim(C*(a)), (B.358) 
whether or not y is cyclic for a. If y is cyclic this gives 
n<dim(C*(a)) <n (B.359) 


by (B.355), so that dim(C*(a)) =n, whence |o(a)| =n by (B.355). 

Given this, the implication 1 — 3 follows from (B.356), whilst 3 + 4 follows 
from Theorem A.21. Finally, we prove 4 > 1: we already know that C*(a) C D,(C), 
and by (B.356) and the above argument it follows that D,(C) is maximal. So if C*(a) 
is maximal, then C*(a) = D,(C), and we already know from the first stage of the 
proof that this is equivalent to a being non-degenerate. 
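

As an illustrative check of Proposition B.113 (added here, not part of the argument above), the following Python sketch computes dim(C*(a)ψ) for a diagonal matrix a as the rank of the vectors ψ, aψ, ..., a^{n-1}ψ, using exact rational arithmetic. It assumes real (here integer) eigenvalues, so that polynomials in a exhaust C*(a), and it omits the normalization 1/√n of (B.357), which does not affect cyclicity. The helper names rank and krylov_dim are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def krylov_dim(eigenvalues, psi):
    """dim span{psi, a psi, ..., a^(n-1) psi} for a = diag(eigenvalues)."""
    n = len(eigenvalues)
    rows = [[(lam ** k) * z for lam, z in zip(eigenvalues, psi)] for k in range(n)]
    return rank(rows)

ones = [1, 1, 1]                       # psi = (1, 1, 1), unnormalized
print(krylov_dim([2, 5, 7], ones))     # non-degenerate a: 3, so psi is cyclic
print(krylov_dim([2, 5, 5], ones))     # degenerate a: 2 < 3, so psi is not cyclic
print(krylov_dim([2, 5, 5], [1, 2, 3]))  # another vector, still 2
```

In the degenerate case the dimension is bounded by |σ(a)| = 2 whatever vector is tried, in line with (B.358) and (B.355).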


With slightly more effort, an analogous result holds for general Hilbert spaces. 


Proposition B.114. A self-adjoint operator a on a separable Hilbert space H has a 
cyclic vector iff W*(a) is maximal abelian (i.e., W*(a)' = W*(a)).


In other words, a has a cyclic vector iff C*(a)' = C*(a)'', cf. (B.320). As we have just
seen, if dim(H) < ∞, this is the case iff a is non-degenerate. Consistent with (B.349)
(with u = 1) and (B.350), the position operator (B.332) acting on the Hilbert space
L^2(σ(a), μ_ψ) is maximal in this sense, with ψ = 1_{σ(a)} as a cyclic unit vector.


Proof. If ψ is cyclic for a, then (B.349) and (B.350) (along with the self-evident
property u A' u^{-1} = (u A u^{-1})') yield W*(a)' = W*(a). Conversely, for any *-algebra
A ⊂ B(H), one can find unit vectors (ψ_i) such that H = ⊕_i H_i with H_i = (Aψ_i)^-
(the closure of Aψ_i): start with any ψ_1, then take any ψ_2 ∈ (Aψ_1)^⊥ (in case this
is nonzero, otherwise one was already done), etc. To show that this procedure
terminates, Zorn’s Lemma must be invoked (take the collection of all sets (H_i) of
mutually orthogonal A-stable subspaces H_i ⊂ H that contain a cyclic vector for A).
Then ψ = Σ_i 2^{-i} ψ_i is clearly separating for A. If A' = A, then ψ is also cyclic
for A; cf. Lemma B.111.


Thus we call a self-adjoint operator a ∈ B(H) maximal if it has a cyclic vector.


Corollary B.115. A maximal self-adjoint operator a ∈ B(H) is unitarily equivalent
to the position operator (B.332) on L^2(σ(a), μ), where μ is an appropriate
probability measure on the spectrum σ(a) ⊂ R. Moreover, the map B(σ(a)) → W*(a) in
(B.321) induces an isomorphism (B.331) of von Neumann algebras.


Proof. Take μ = μ_ψ, cf. (B.304), where ψ is cyclic (or, equivalently, separating) for
a. The map f ↦ f(a) from B(σ(a)) to W*(a) described in Theorem B.102 can be
propelled further by conjugation with the unitary u of Theorem B.112, that is,


f ↦ f(a) ↦ u f(a) u^{-1} = m_f; (B.360)
B(σ(a)) → B(H) → B(L^2(σ(a), μ_ψ)), (B.361)


where the final equality in (B.360) follows from the computation
u f(a) u^{-1} g = u f(a) g(a)ψ = u (f·g)(a)ψ = f g = m_f g, (B.362)


where for simplicity g ∈ C(σ(a)) ⊂ L^2(σ(a), μ_ψ), the inclusion being dense. The
claim then immediately follows from (B.349).


If a is not maximal, we can still prove a weaker version of Theorem B.112, which
is sometimes seen as the ultimate version of the spectral theorem. To justify this
view, take H = C^n and let a ∈ M_n(C) be self-adjoint (or, more generally, normal).
By Theorem A.10, H has a basis (v_i) of eigenvectors of a, with a v_i = λ_i v_i. This
yields a unitary map u : H → ℓ^2(n), where n = {1,2,...,n}, defined by u v_i = δ_i (where
δ_i(j) = δ_ij, as usual). It is easy to check that u a u^{-1} = m_λ, where λ : n → C is
defined by λ(i) = λ_i, and m_λ ψ = λψ, again as usual. In other words, a is unitarily
equivalent to a multiplication operator (whose precise nature is left unspecified).
Conversely, each multiplication operator m_f on some L^2(X, μ) is normal, and is
self-adjoint if the function f ∈ L^∞(X, μ) is real-valued (μ-almost everywhere).
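

The finite-dimensional statement can be checked on a small example. The sketch below is an illustration added here: the 2×2 matrix and its hand-computed eigenvectors are hypothetical choices, and the helpers matmul and transpose are ad hoc. It builds u from u v_i = δ_i and verifies that u a u^{-1} is the diagonal matrix of the function λ, i.e., a multiplication operator on ℓ^2 of a two-point set.

```python
import math

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

s = 1 / math.sqrt(2)
a = [[2.0, 1.0], [1.0, 2.0]]       # self-adjoint (real symmetric) matrix on C^2
eigenvectors = [[s, s], [s, -s]]   # rows v_1, v_2 with a v_1 = 3 v_1 and a v_2 = 1 v_2
eigenvalues = [3.0, 1.0]           # the function lambda on the two-point set {1, 2}

# u sends v_i to the i-th standard basis vector delta_i, so (in this real case)
# the rows of u are the eigenvectors and u^{-1} is the transpose of u.
u = eigenvectors
m_lambda = matmul(matmul(u, a), transpose(u))

for row in m_lambda:
    print([round(entry, 12) for entry in row])
```

Up to rounding, the printed matrix is diag(3, 1), which is multiplication by λ in the basis (δ_1, δ_2).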


Theorem B.116. Any bounded self-adjoint (more generally, normal) operator on a 
separable Hilbert space is unitarily equivalent to a multiplication operator.


Proof. As in the proof of Proposition B.114, decompose H = ⊕_{i∈I} H_i, where each H_i
contains some cyclic (and hence separating) vector ψ_i for a. Applying the proof of
Theorem B.112 to each H_i then yields unitary isomorphisms H_i ≅ L^2(σ(a), μ_i), with
μ_i = μ_{ψ_i}, from which, taking direct sums, we obtain a further unitary isomorphism


H ≅ ⊕_{i∈I} L^2(σ(a), μ_i). (B.363)


Now take the disjoint union X = ⊔_{i∈I} σ(a), i.e., X = ∪_{i∈I} X_i, where X_i = σ(a) × {i},
endowed with the σ-finite measure μ = Σ_i μ_i (so that if A ⊂ X is given by A = ∪_i A_i
with A_i ⊂ X_i, we have μ(A) = Σ_i μ_i(A_i)). This gives a second isomorphism


⊕_{i∈I} L^2(σ(a), μ_i) ≅ L^2(X, μ), (B.364)


defined by mapping ψ_i ∈ L^2(σ(a), μ_i) to the same function on X_i, extended to X
by putting it zero on all other X_j, j ≠ i. This map is obviously unitary. By Theorem
B.112, the isomorphism (B.363) maps the operator a to a direct sum ⊕_i m_{id_{σ(a)}} of
multiplication operators, upon which the second isomorphism (B.364) maps this
direct sum to a (single) multiplication operator m_q, where the function q : X → C is
defined by q(x,i) = x (in which (x,i) ∈ X_i ⊂ X, so that x ∈ σ(a) ⊂ C).


More generally, the operator f(a) on H, for some f ∈ B(σ(a)), is first mapped to
⊕_i m_{f_i}, where f_i is the image of f in L^∞(σ(a), μ_i) in the obvious way, which in turn
is mapped to a multiplication operator m_{f~}, where f~(x,i) = f(x), analogously to the
position operator q above (which is the case f = id_{σ(a)}). This leads to an
isomorphism W*(a) ≅ L^∞(X, μ),
which, by the same reasoning as in the proof of Corollary B.115, also induces an
isomorphism (B.331) of von Neumann algebras. See also Theorem C.140.

Finally, Proposition B.114 may be generalized, to which end (and also as a result
of independent interest) we extend Corollary A.20 to the infinite-dimensional case:


Theorem B.117. Let H be separable and let A ⊂ B(H) be an abelian von Neumann
algebra. Then A = W*(a) for some self-adjoint a ∈ B(H), i.e., A is singly generated.


Proof. Let P(A) be the set of all projections in A, and let ψ ∈ H be separating for
A and hence cyclic for A' (cf. Lemma B.111 and the proof of Proposition B.114).
The ensuing subset P(A)ψ = {eψ | e ∈ P(A)} may be uncountable, but since
any subspace of a separable metric space is separable, there is a countable subset
P_c(A) = {e_n, n ∈ N} of P(A) such that P_c(A)ψ is dense in P(A)ψ, i.e., for any
e ∈ P(A) there is a subsequence (e_{n_k}) in P_c(A) such that lim_{k→∞} e_{n_k}ψ = eψ. But
since P(A) ⊂ A ⊂ A' and (A'ψ)^- = H, this is true not only on ψ but on a dense set
of vectors aψ, a ∈ A', so that e_{n_k} → e in the strong operator topology. Thus P_c(A)
is strongly dense in P(A), and by (B.329) and Theorem B.100 we have


P_c(A)'' = A. (B.365)


The self-adjoint operator that does the job is now given by von Neumann’s formula


a = Σ_n 3^{-n} (2e_n − 1_H). (B.366)


To see this, let C*(e_n, n ∈ N) ≡ C*(e_n)_n be the C*-algebra generated by the
projections e_n, so that by construction


P_c(A)'' = C*(e_n)_n''. (B.367)


We will show that
C*(a) = C*(e_n)_n, (B.368)


which combined with (B.320), (B.365) and (B.367) yields the desired conclusion:
A = P_c(A)'' = C*(e_n)_n'' = C*(a)'' = W*(a). (B.369)
The simplest argument for (B.368) uses the Gelfand isomorphism
C*(e_n)_n ≅ C(X) (B.370)
as commutative C*-algebras, cf. Theorem C.8, where the set of characters
X = {x : C*(e_n)_n → C | x(bc) = x(b)x(c), x(1_H) = 1} (B.371)
of C*(e_n)_n is equipped with the weakest topology that makes all maps


b̂ : X → C; (B.372)
b̂(x) = x(b), b ∈ C*(e_n)_n, (B.373)


continuous. This makes X a compact Hausdorff space, and the isomorphism (B.370)
is given by the Gelfand transform b ↦ b̂. Defining s_n = 2e_n − 1_H, we have ‖s_n‖ = 1,
since s_nψ = ψ if ψ ∈ e_nH and s_nψ = −ψ if ψ ∈ (1_H − e_n)H = (e_nH)^⊥. The series
(B.366) therefore converges absolutely in B(H), and hence converges, to some limit
a ∈ C*(e_n)_n. We claim that its Gelfand transform â ∈ C(X) separates points of X, so
that by the Stone-Weierstrass Theorem B.51, the *-algebra it generates is dense in
C(X) (in its canonical sup-norm). Thus a likewise generates C*(e_n)_n, and the proof
of Theorem B.117 is complete up to the proof of the above claim, which we now give.

First, note that by definition C*(e_n)_n is generated by the projections e_n, so
that by (B.371) (and the automatic continuity this implies, i.e., x ∈ (C*(e_n)_n)*), each
x ∈ X is determined by its values on all e_n. Therefore, for each pair x_i, x_j ∈ X, i ≠ j,
there must be some n ∈ N for which x_i(e_n) ≠ x_j(e_n). Consequently, for each i ≠ j,
the set N_ij = {n ∈ N | x_i(e_n) ≠ x_j(e_n)} is not empty; let n_ij = min N_ij. Since for any
projection e the corresponding function ê can only take the values 0 or 1, each ŝ_n
must take the values ±1, so that, with â = Σ_n 3^{-n} ŝ_n, we have


½ (â(x_i) − â(x_j)) = ±3^{-n_ij} + Σ_{n∈N_ij, n>n_ij} ±3^{-n} ≠ 0, (B.374)


since whatever the signs, the sum is always smaller in absolute value than the first term.
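

The separation argument rests on the elementary bound Σ_{n>m} 3^{-n} = 3^{-m}/2 < 3^{-m}. As a small illustration added here (a finite truncation only, with the hypothetical helper name a_hat), the following Python sketch checks with exact rational arithmetic that distinct sign patterns (s_1,...,s_N) always give distinct values of Σ_n 3^{-n} s_n.

```python
from fractions import Fraction
from itertools import product

def a_hat(signs):
    """Value of sum_n 3^(-n) s_n for a finite sign sequence (s_1, ..., s_N)."""
    return sum(Fraction(s, 3 ** n) for n, s in enumerate(signs, start=1))

N = 8
values = {a_hat(signs) for signs in product((-1, 1), repeat=N)}

# All 2^N sign patterns give different values, so the map is injective.
print(len(values) == 2 ** N)

# The tail after position m is strictly smaller than the term at position m.
m = 3
tail = sum(Fraction(1, 3 ** n) for n in range(m + 1, N + 1))
print(tail < Fraction(1, 3 ** m))
```

Both checks print True for this truncation; the infinite case is covered by the geometric-series bound quoted above.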
B.17 Classification of maximal abelian *-algebras in B(H)


We now prove the following classification of maximal abelian *-algebras in B(H), 
which forms the basis of the Kadison-Singer Conjecture discussed in §2.6 and §4.3.


Theorem B.118. If H is separable (and infinite-dimensional), and A ⊂ B(H) is a
maximal abelian *-algebra, then A is unitarily equivalent to one of the following:


1. L^∞(0,1) ⊂ B(L^2(0,1)) (realized as multiplication operators);
2. ℓ^∞(N) ⊂ B(ℓ^2(N)) (idem);
3. L^∞(0,1) ⊕ ℓ^∞(N) ⊂ B(L^2(0,1) ⊕ ℓ^2(N)) (idem);
4. L^∞(0,1) ⊕ D_n(C) ⊂ B(L^2(0,1) ⊕ C^n), for some n ∈ N (idem),


and these possibilities are (mutually) unitarily inequivalent.


The first claim means that there is a unitary operator u from H to, say, L^2(0,1), such
that the map a ↦ u a u^{-1} from B(H) to B(L^2(0,1)) restricts to u A u^{-1} = L^∞(0,1), so
that A ≅ L^∞(0,1) as both C*-algebras and von Neumann algebras (and likewise for
the other possibilities). The last claim, then, means that there is no unitary map from,
say, L^2(0,1) to ℓ^2(N) that similarly induces an isomorphism L^∞(0,1) ≅ ℓ^∞(N).


Proof. We begin with the easy part, which is the last clause. The key notion for
proving the claimed inequivalence is that of an atomic projection in a von Neumann
algebra M ⊂ B(H). If we partially order projections on H by (cf. Theorem 2.50 and
§C.21)

e ≤ f iff eH ⊆ fH, (B.375)


we say that f is atomic (in M) if f ≠ 0, and 0 ≤ e ≤ f (for projections e ∈ M) implies
either e = 0 or e = f. This property is preserved under unitary equivalence: if
M ⊂ B(H) and N ⊂ B(H') and N = u M u^{-1} for some unitary u : H → H' (again in the
sense that a ↦ u a u^{-1} is an isomorphism M ≅ N), then f is atomic in M iff u f u^{-1}
is atomic in N. The reason is that a ↦ u a u^{-1} induces an isomorphism of the
pertinent posets of projections in M and N, so that all order-theoretical notions are
preserved under unitary equivalence.
In the case at hand, the projections are easy to classify:


1. The nonzero projections in L^∞(0,1) are the characteristic functions of measurable
subsets of [0,1] of positive Lebesgue measure. Since any such subset properly
contains another such subset, there are no atomic projections in L^∞(0,1).

2. The nonzero projections in ℓ^∞(N) are the characteristic functions of nonempty
subsets of N, among which there are plenty of atomic ones, namely the one-dimensional
projections δ_x, x ∈ N. Thus ℓ^∞(N) has countably many atomic projections. Moreover,
each other nonzero projection majorizes an atomic one.

3. Similarly, L^∞(0,1) ⊕ ℓ^∞(N) has countably many atomic projections, as well
as uncountably many projections that do not majorize any atomic one.

4. Since the atomic projections in D_n(C) are the one-dimensional ones (given by
diagonal matrices with n − 1 zeros and exactly one entry equal to unity),
L^∞(0,1) ⊕ D_n(C) has exactly n atomic projections, as well as uncountably many
projections that do not majorize any atomic one (namely the ones in L^∞(0,1)).


Any unitary equivalence between two of the entries in the list would have to preserve
this fine structure of projections, and hence cannot exist.

We now prove that the list in Theorem B.118 is exhaustive. According to Theorem
B.117, we only need to look at abelian von Neumann algebras A = W*(a),
where a is maximal. According to Theorem B.112 and its Corollary B.115 (whilst
noting that some unitary equivalence a ≅ b induces a unitary equivalence W*(a) ≅
W*(b)), we may further restrict our attention to the case where a is the position
operator on L^2(K, μ), where K = σ(a) ⊂ R is compact and μ is a regular probability
measure (here and in what follows, this is always meant with respect to the Borel
structure inherited from R ⊃ K), with support equal to K, and hence


W*(a) = L^∞(K, μ) ⊂ B(L^2(K, μ)). (B.376)
The final step is to further reduce the possibilities by exploiting equivalences. 


Definition B.119. Two measure spaces (X, Σ, μ) and (X', Σ', μ') are:


• equivalent if there is a measurable bijection φ : X → X' with measurable inverse,
and the measures φ_*μ and μ' on X' are equivalent in the sense that φ_*μ(A') = 0
iff μ'(A') = 0 for each A' ∈ Σ'. Here φ_*μ is the measure on (X', Σ') defined by


φ_*μ(A') = μ(φ^{-1}(A')) (A' ∈ Σ'). (B.377)


• isomorphic if there is a measurable bijection φ : X → X' with measurable
inverse, and φ_*μ(A') = μ'(A') for each A' ∈ Σ'.


The ambiguity of the notation φ^{-1} in (B.377) is innocent: for general measurable
maps φ : X → X' the set φ^{-1}(A') can only denote the pre-image {x ∈ X | φ(x) ∈ A'},
whereas for invertible maps one might construe φ^{-1}(A') as {φ^{-1}(x') | x' ∈ A'},
where φ^{-1} is the set-theoretic inverse of φ. Of course, these sets duly coincide.


Lemma B.120. Let K and K' be compact subsets of R, with Σ and Σ' the Borel
structures inherited from R ⊃ K and R ⊃ K', respectively (often omitted in what
follows). Let μ and μ' be probability measures on K and K', respectively, and suppose
that the associated measure spaces (K, Σ, μ) and (K', Σ', μ') are isomorphic.

Then there exists a unitary operator


u : L^2(K, μ) → L^2(K', μ')
such that
u L^∞(K, μ) u^{-1} = L^∞(K', μ'). (B.378)


Note that u does not intertwine the position operators (B.332) on L^2(K, μ) and
L^2(K', μ'). These operators have already done their job in reducing the situation to
L^2(K, μ), and from that point onwards (B.378) is exactly what we need.


Proof. All maps appearing below are assumed Borel. The change-of-variables
formula for a general (i.e., not necessarily invertible) map φ : K → K' reads


∫_{K'} d(φ_*μ) g = ∫_K dμ g∘φ, (B.379)


where g : K' → C. Under the assumption that φ is invertible, this can be rewritten as


∫_{K'} d(φ_*μ) f∘φ^{-1} = ∫_K dμ f, (B.380)


where f : K → C. If φ is also an isomorphism of measure spaces, this becomes


∫_{K'} dμ' f∘φ^{-1} = ∫_K dμ f. (B.381)


If φ_*μ and μ' are equivalent and hence mutually absolutely continuous, the
Radon-Nikodym derivative d(φ_*μ)/dμ' exists (as does its counterpart d(φ^{-1}_*μ')/dμ),
and using (B.137) and (B.380), one easily verifies that the operator


u : L^2(K, μ) → L^2(K', μ'); (B.382)
uψ = √(d(φ_*μ)/dμ') · ψ∘φ^{-1} (B.383)


is isometric. Moreover, u is unitary, because it has an inverse, given by


u^{-1} : L^2(K', μ') → L^2(K, μ); (B.384)
u^{-1}χ = √(d(φ^{-1}_*μ')/dμ) · χ∘φ. (B.385)


We give these general expressions for later use; if φ_*μ = μ', they simplify to


uψ = ψ∘φ^{-1}; (B.386)
u^{-1}χ = χ∘φ. (B.387)


For f ∈ L^∞(K, μ) we then have (cf. Proposition B.73)
u m_f u^{-1} = m_{f∘φ^{-1}}. (B.388)


We already know that the map f ↦ m_f injects L^∞(K, μ) isometrically into B(L^2(K, μ)),
and analogously for L^∞(K', μ'). Furthermore, the map f ↦ f∘φ^{-1} gives an
isomorphism L^∞(K, μ) → L^∞(K', μ'): the property


‖f∘φ^{-1}‖_∞ = ‖f‖_∞, (B.389)


which yields injectivity, may be checked either from (B.240) or from the assumed
isomorphism of measures (and hence equivalence of measures, which in fact suffices
for this purpose), whereas invertibility of φ gives surjectivity (since g ∈ L^∞(K', μ')
is the image of f = g∘φ ∈ L^∞(K, μ)). Eq. (B.378) follows.


The final step of the proof appeals to a deep and fundamental classification theorem
in measure theory, which goes back to Kuratowski in a form that applies to general
Polish (i.e., complete separable metric) spaces. This theorem implies:


Lemma B.121. Let (K, Σ, μ) be an infinite probability space (in that infinitely many
different elements of Σ have positive measure), where K ⊂ R is compact and Σ is
the σ-algebra inherited from the Borel structure on R. Then (K, Σ, μ) is isomorphic
to exactly one of the following possibilities (called standard measure spaces):


1. K = [0,1] with μ equal to Lebesgue measure μ_L;

2. K = N' = {2^{-n}, n ∈ N} ∪ {1}, equipped with any probability measure μ' for
which μ'({2^{-n}}) > 0 for each n ∈ N and μ'({1}) = 0;

3. K = [0,1] with μ = tμ_L + (1−t)μ', for some 0 < t < 1;

4. K = [0,1] with μ = tμ_L + (1−t)μ_n, for some n ∈ N and 0 < t < 1,


where μ_n is an arbitrary strictly nonzero probability measure on the n-point set
n' = {1/n,...,(n−1)/n,1}. (B.390)


Here we have stated the result in terms of probability measures μ on compact spaces
K ⊆ [0,1]; this is convenient in the context of our proof. To understand the last two
cases, for general measure spaces (X, Σ, μ) we say that A ∈ Σ is an atom if for any
measurable B ⊆ A we have either μ(B) = 0 or μ(A\B) = 0 (but not both; this implies
μ(A) > 0, whence an equivalent definition of an atom as a set A ∈ Σ having positive
measure as well as the property that if some measurable subset B ⊆ A has measure
μ(B) < μ(A), then μ(B) = 0). In our case at hand (K, μ), each atom A contains a point
x ∈ K such that μ(A) = μ({x}) and μ(A\{x}) = 0, so that modulo null sets we may
identify each atom A with the measure-carrying point x it contains. Moreover, K can
contain at most a countable set 𝒜 = {x_n}_n of such points x_n. The formulae


μ = μ_a + μ_c; (B.391)
μ_a(A) = μ(A ∩ 𝒜); (B.392)
μ_c(A) = μ(A\(A ∩ 𝒜)), (B.393)


then give the canonical decomposition of μ into an atomic part μ_a and a continuous
part μ_c. This, then, is the sense in which the last two cases of Lemma B.121 are
meant. Note that characteristic functions 1_A on atoms A ⊂ K yield atomic projections
in L^∞(K, μ), linking the two notions of atomicity that play a role in this proof.

The first entry of this lemma yields the first entry in the list in the theorem. To
obtain the others, we need a few more unitary equivalences. For the second, define


u : L^2(N', μ') → ℓ^2(N); (B.394)
uψ(n) = √(μ'({2^{-n}})) ψ(2^{-n}), (B.395)
with ψ(1) irrelevant (since μ'({1}) = 0). This operator is unitary and, just like in
(B.378), it intertwines


u L^∞(N', μ') u^{-1} = ℓ^∞(N). (B.396)


Note that (B.394) is a special case of (B.383). The third and fourth cases require
the following construction: if 𝒜 ⊂ K is the set of atoms in (K, Σ, μ), we decompose


K = (K\𝒜) ⊔ 𝒜, (B.397)


as a disjoint union. For any measure μ this induces an orthogonal decomposition


L^2(K, μ) = L^2(K\𝒜, μ) ⊕ L^2(𝒜, μ); (B.398)
L^2(K\𝒜, μ) = e L^2(K, μ); (B.399)
L^2(𝒜, μ) = (1_{L^2(K,μ)} − e) L^2(K, μ), (B.400)


where e = 1_{K\𝒜} and 1_{L^2(K,μ)} − e = 1_𝒜 are projections. Using (B.391), this gives


L^2(K\𝒜, μ) = L^2(K, μ_c); (B.401)
L^2(𝒜, μ) = L^2(𝒜, μ_a), (B.402)


so that at the end of the day we obtain
L^2(K, μ) = L^2(K, μ_c) ⊕ L^2(𝒜, μ_a). (B.403)


This in turn induces the decomposition


L^∞(K, μ) = L^∞(K, μ_c) ⊕ L^∞(𝒜, μ_a); (B.404)
L^∞(K, μ_c) = e L^∞(K, μ) = e L^∞(K, μ) e; (B.405)
L^∞(𝒜, μ_a) = (1_{L^2(K,μ)} − e) L^∞(K, μ)
= (1_{L^2(K,μ)} − e) L^∞(K, μ) (1_{L^2(K,μ)} − e). (B.406)


Combined with (B.396), this shows that the third entry of the lemma yields the third
entry of the theorem. To obtain the fourth and last, we need the unitary map


u : L^2(n', μ_n) → C^n; (B.407)
(uψ)_m = √(μ_n(m/n)) ψ(m/n) (m = 1,...,n), (B.408)
which delivers the unitary equivalence


u L^∞(n', μ_n) u^{-1} = D_n(C). (B.409)


Short of a proof of Lemma B.121, we have (at last!) proved Theorem B.118. 
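

The last unitary map, (B.407) - (B.408), is simple enough to illustrate numerically. The following Python sketch is an addition for illustration only: the weights mu and the sample functions psi and f are hypothetical choices, and u_inv is an ad hoc helper. It checks that u preserves the norm and that conjugating multiplication by f with u gives the diagonal matrix diag(f(1/n),...,f(1)), as in (B.409).

```python
import math

n = 4
points = [m / n for m in range(1, n + 1)]   # the n-point set n' = {1/n, ..., 1}
mu = [0.1, 0.2, 0.3, 0.4]                   # hypothetical weights mu_n(m/n) > 0 summing to 1

def u(psi):
    """(u psi)_m = sqrt(mu_n(m/n)) * psi(m/n), cf. (B.408)."""
    return [math.sqrt(w) * psi(x) for w, x in zip(mu, points)]

def u_inv(vec):
    """Inverse of u, returned as the list of values of the function on the points of n'."""
    return [z / math.sqrt(w) for w, z in zip(mu, vec)]

psi = lambda x: 1 + x * x    # sample element of L^2(n', mu_n)
f = lambda x: 3 * x - 1      # sample element of L^infty(n', mu_n)

# Squared norms in L^2(n', mu_n) and in C^n agree, so u is isometric.
norm_L2 = sum(w * abs(psi(x)) ** 2 for w, x in zip(mu, points))
norm_Cn = sum(abs(z) ** 2 for z in u(psi))

# u m_f = diag(f) u, i.e., u m_f u^{-1} is the diagonal matrix with entries f(m/n).
lhs = u(lambda x: f(x) * psi(x))
rhs = [f(x) * z for x, z in zip(points, u(psi))]

print(math.isclose(norm_L2, norm_Cn))
print(all(math.isclose(p, q) for p, q in zip(lhs, rhs)))
```

Both checks are expected to print True; the same computation with countably many weights μ'({2^{-n}}) underlies (B.394) - (B.396).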


Thus one of the remarkable novelties of infinite-dimensional Hilbert space is that 
even in the separable case, uniqueness of maximal abelian *-algebras is lost. 

There is a different proof of Theorem B.118 that does not rely on Kuratowski’s 
Lemma B.121, but instead is based on properties of the projection lattice P(A) in A.
In the following outline of this proof, A is a maximal abelian *-subalgebra of B(H),
where H is a separable Hilbert space. Hence A is a von Neumann algebra, which is
generated by its projections. This leaves three mutually exclusive possibilities:


1. A has no minimal projections;
2. A is generated by its minimal projections;
3. A has minimal projections that do not generate A.


The following lemma, whose proof we merely sketch, replaces Lemma B.121. 


Lemma B.122. If H is separable and A ⊂ B(H), then P(A) contains a maximal
totally ordered set T(A) that generates A (as a von Neumann algebra).


Proof. This is proved in two steps. First, P(A) contains a countable subset P_c(A)
that generates A. Indeed, according to Lemma B.111 and Proposition B.114 (and
maximality of A), H contains a unit vector ψ that is both cyclic and separating for
A. Since H is separable, P(A)ψ ⊂ H has a countable dense subset, which we write as
P_c(A)ψ; this defines P_c(A).
The second step is trickier, namely to construct a maximal totally ordered set
T(A) from P_c(A). This is done inductively. We number P_c(A) = {e_1, e_2, ...}.
Starting from P_1 = {0_H, e_1, 1_H}, we now construct finite totally ordered sets P_n of
projections such that P_n ⊂ P_{n+1} and e_n lies in the linear span of P_n. Let


P_n = {e'_0 = 0_H, e'_1, ..., e'_m = 1_H}, (B.410)
where e'_0 < ··· < e'_m (where e < f means e ≤ f and e ≠ f), and define
P_{n+1} = P_n ∪ {e'_i + (e'_{i+1} − e'_i) e_{n+1}, i = 0,...,m−1}. (B.411)


Given the total ordering in P_n, it is easy to see that each e'_i + (e'_{i+1} − e'_i) e_{n+1} is
indeed a projection, and, by the same token, that P_{n+1} meets its specification. Let


P_∞ = ∪_n P_n, (B.412)


which remains totally ordered but typically is infinite, and take the poset of
all totally ordered subsets of P(A) that contain P_∞, ordered by inclusion. Zorn’s
Lemma then yields a maximal element of this poset, and this is our T(A): this maximal
element is itself totally ordered, and since its linear span contains each projection
e_n ∈ P_c(A), the projections in T(A) generate A (since the e_n already do so).
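

The inductive step of this proof can be made concrete in a finite-dimensional toy model. In the Python sketch below (an illustration added here, not part of the proof), commuting projections are modelled as subsets of a finite index set, i.e., as diagonal 0/1 matrices: sums of orthogonal projections become disjoint unions and products become intersections. The function names refine, is_chain and in_span, and the three sample projections, are hypothetical.

```python
def refine(chain, e_next):
    """One refinement step: add e'_i + (e'_{i+1} - e'_i) e_next for consecutive e'_i, e'_{i+1}."""
    new = set(chain)
    for lower, upper in zip(chain, chain[1:]):
        new.add(lower | ((upper - lower) & e_next))
    # Distinct members of a chain of finite sets have distinct sizes,
    # so sorting by size recovers the total order.
    return sorted(new, key=len)

def is_chain(sets):
    """True if each set is contained in the next one."""
    return all(s <= t for s, t in zip(sets, sets[1:]))

def in_span(chain, e):
    """e lies in the linear span of the chain iff it is a union of the gaps e'_{i+1} - e'_i."""
    gaps = [upper - lower for lower, upper in zip(chain, chain[1:])]
    return all(gap <= e or not (gap & e) for gap in gaps)

full = frozenset(range(6))                                  # plays the role of 1_H
projections = [frozenset({0, 1}), frozenset({1, 2, 3}), frozenset({3, 5})]  # e_1, e_2, e_3

chain = [frozenset(), projections[0], full]                 # starting chain {0, e_1, 1}
for e in projections[1:]:
    chain = refine(chain, e)

print(is_chain(chain))                                      # the result is totally ordered
print(all(in_span(chain, e) for e in projections))          # each e_n lies in its span
```

For these sample sets both checks print True, and the final chain has seven elements, one of each size from 0 to 6.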


The above trichotomy then leaves the following possibilities: 


1. Let ψ ∈ H be a unit vector that is cyclic and separating for A. Then


α : T(A) → [0,1]; (B.413)
e ↦ ⟨ψ, eψ⟩, (B.414)


is an isomorphism of posets. It is easy to show that the linear span of the set of
all vectors α^{-1}(t)ψ, t ∈ [0,1], is dense in H, and that the map


u α^{-1}(t)ψ = 1_{(0,t)} (B.415)


extends (by linearity and continuity) to a unitary isomorphism


u : H → L^2(0,1), (B.416)

which intertwines A with L^∞(0,1) in the sense that
u A u^{-1} = L^∞(0,1). (B.417)


2. This case relies on a general fact about von Neumann algebras M: if e ∈ P(M)
is minimal, then eMe = C · e. This implies that if M = A is abelian, then for each
a ∈ A one has ea = λe for some λ ∈ C. It follows that:


• Each minimal projection e_i in P(A) is one-dimensional.
• Different minimal projections are orthogonal.
• 1_H = Σ_i e_i (strongly), where the sum is over all minimal projections in A.


Since H is separable, we may assume i ∈ N, so that we obtain a countable basis
(v_i) of H in which e_i = |v_i⟩⟨v_i|, and hence have a unitary isomorphism


u : H → ℓ^2(N); (B.418)
v_i ↦ δ_i, (B.419)


i.e., u is defined by linear and continuous extension of (B.419). Clearly,
u A u^{-1} = ℓ^∞(N). (B.420)


3. The first part of the analysis in the previous item still applies, but this time, the
sum e = Σ_i e_i over all minimal projections in A is not equal to 1_H. If there are
n ∈ N such projections, we obtain


eH ≅ C^n, (B.421)
and otherwise
eH ≅ ℓ^2(N). (B.422)
We combine these in the notation
eH ≅ ℓ^2(κ), (B.423)


where κ = n, in which case ℓ^2(κ) = C^n and ℓ^∞(κ) = D_n(C), or κ = N. Furthermore,
we have
(1_H − e)H ≅ L^2(0,1), (B.424)


as in the first item. By construction, the corresponding unitary
u : H → ℓ^2(κ) ⊕ L^2(0,1) (B.425)


then satisfies


u A u^{-1} = ℓ^∞(κ) ⊕ L^∞(0,1). (B.426)


This finishes the alternative proof (sketch) of Theorem B.118.


B.18 Compact operators


The spectral theorem (in whatever version) on infinite-dimensional Hilbert spaces 
considerably simplifies for a class of well-behaved operators called compact. 


Definition B.123. A linear map a : V → W between Banach spaces V, W is called
compact if for some (and hence all) d > 0 the image a(V_{≤d}) of the closed d-ball


V_{≤d} = {v ∈ V : ‖v‖ ≤ d} (B.427)


is pre-compact in W (i.e., its closure a(V_{≤d})^- is compact), or, equivalently, if the
image (a v_n) of any bounded sequence (v_n) in V has a convergent subsequence.


Before turning to Hilbert spaces, we mention two facts of general interest. 


Proposition B.124. A compact operator is bounded. 


Proof. If not, then for any n ∈ N there is some v_n ∈ V_{≤1} for which ‖a v_n‖ > n, so
that (a v_n) cannot possibly have a convergent subsequence.


Proposition B.125. A compact operator a : V → W maps weakly convergent
sequences in V to norm-convergent sequences in W.


Proof. Let (v_n) be a sequence in V that weakly converges to v. It is easy to show
that if a : V → W is (norm) continuous, then it maps weakly convergent sequences
in V to weakly convergent sequences in W. Therefore, the sequence (a v_n) weakly
converges to av. If (a v_n) failed to converge to av in norm, then it would have a
subsequence (a v_{n_k}) such that for some ε > 0 and all sufficiently large k one had


‖a v_{n_k} − av‖ ≥ ε. (B.428)


However, (v_n), being weakly convergent, is bounded by Lemma B.126 below, and
hence also its subsequence (v_{n_k}) must be bounded. Since a is compact, (a v_{n_k}) has
some norm-convergent subsequence, which necessarily converges to av (since we
know this is the weak limit of the ambient sequence (a v_n) and hence also of any of
its subsequences, and if a norm-limit exists, the corresponding weak limit must be
the same). But for large enough k this convergence flatly contradicts (B.428).


Lemma B.126. A weakly convergent sequence in a Banach space is bounded. 


Proof. Since v_n → v weakly, the sequence (φ(v_n)) in C converges to φ(v) for each
φ ∈ V*, so that sup_n{|φ(v_n)|} < ∞. Using the notation (B.129), this may be rewritten
as sup_n{|v̂_n(φ)|} < ∞. Using Theorem B.78 (with V ⇝ V**, W = C, and X = N),
this implies sup_n{‖v̂_n‖} < ∞, and hence sup_n{‖v_n‖} < ∞ by Proposition B.44.


Definition B.123 simplifies if V = W = H is a Hilbert space, since we have: 


Proposition B.127. If the image a(H_{≤1}) ⊂ H of a linear map a : H → H is
pre-compact, then this image is in fact compact (and hence a is compact).


For the proof, call a Banach space V reflexive if V** ≅ V (i.e. through the canonical
injection v ↦ v̂, cf. Proposition B.44). Hilbert spaces H are reflexive, since H* ≅ H
by Theorem B.66. Proposition B.127 then follows from yet another lemma:


Lemma B.128. If V is a reflexive Banach space and a : V → W is compact, then
a(V_{≤1}) is compact.


Proof. The proof relies on a corollary of the Banach-Alaoglu Theorem B.48,
according to which V_{≤1} is weakly compact if V is reflexive (indeed, by applying
Banach-Alaoglu to V* instead of V, it follows that the unit ball in V** is compact in
its weak*-topology; if, in addition, V is reflexive, then the inverse of the canonical
injection V → V** maps the weak*-topology on V** to the weak topology on V).
So let a : V → W be compact, and let (w_n) be a sequence in a(V_{≤1}), say w_n = a v_n for
some sequence (v_n) in V_{≤1}. Then since V_{≤1} is weakly compact, (v_n) has a weakly
convergent subsequence (v_{n_k}) in V_{≤1}, say lim_{k→∞} v_{n_k} = v weakly. By Proposition
B.125, lim_{k→∞} a v_{n_k} = av in norm. In other words, (a v_n) has a norm-convergent
subsequence, namely (a v_{n_k}), with limit in a(V_{≤1}). Hence a(V_{≤1}) is compact.


In view of Proposition B.127, we may as well take the following starting point: 


Definition B.129. If H is a Hilbert space, a linear map a : H → H is called compact
when the image a(H_{≤1}) of the closed unit ball in H is compact.


We write B_0(H) for the set of all compact operators on H.


Theorem B.130. The compact operators B_0(H) form a C*-algebra in B(H) in the
operations inherited from B(H). Furthermore, B_0(H) is a two-sided ideal in B(H).


Unfolding this theorem, the claim consists of the following parts: 


1. B_0(H) ⊂ B(H), i.e., a compact operator is automatically bounded.

2. B_0(H) is a vector space.

3. If a, b ∈ B_0(H), then ab ∈ B_0(H).

4. If (a_n) is a convergent sequence in B(H) with limit a, i.e., ‖a_n − a‖ → 0 for some
a ∈ B(H), and if each a_n ∈ B_0(H), then a ∈ B_0(H).

5. If a ∈ B_0(H), then a* ∈ B_0(H).

6. If a ∈ B_0(H) and b ∈ B(H), then ab ∈ B_0(H) and ba ∈ B_0(H).


Proof. The first clause is Proposition B.124, and the second and sixth (which
implies the third) are almost trivial. For the fourth, we use the following criterion for
pre-compactness (in a metric space): K ⊂ H is pre-compact iff for each ε > 0 it can
be covered by a finite number of open ε-balls B_ε(χ_i) = {ψ ∈ H : ‖ψ − χ_i‖ < ε},
where i = 1,...,m < ∞ (i.e., all balls have the same radius ε). Given that ‖a_n − a‖ → 0,
for each ε > 0 there is n such that ‖a_n − a‖ < ε/2. Since a_n(H_{≤1}) is compact, it
has a finite cover with ε/2-balls; in other words, for each ψ ∈ H_{≤1} there is an i such
that ‖a_nψ − χ_i‖ < ε/2. Hence, as ‖ψ‖ ≤ 1, we may estimate


‖aψ − χ_i‖ ≤ ‖(a_n − a)ψ‖ + ‖a_nψ − χ_i‖ ≤ ‖a_n − a‖‖ψ‖ + ½ε < ½ε + ½ε = ε.


So a(H_{≤1}) has a finite cover with ε-balls and hence is pre-compact. This finishes
the proof from Definition B.123; from Definition B.129, invoke Proposition B.127.
To prove the fifth clause, we need a result of independent interest. We say that a
linear map a : H → H is (or has) finite rank if its image is finite-dimensional.


Proposition B.131. A bounded operator a € B(H) is compact iff it is a norm-limit 
of finite-rank operators. 


Proof. Since it is easy to see that finite-rank operators are compact, the “⇐”
direction follows from clause 4 of Theorem B.130. The difficult direction is the opposite
one, which we prove by contradiction (as a technical note, our proof assumes that
H is separable, but the claim also holds in the non-separable case, in which it can be
shown that ran(a) is separable whenever a is compact).

Pick a basis (v_i) of H (or, in the non-separable case, of ran(a)), and define e_n
to be the projection onto the linear span of the first n basis vectors. Given some
a ∈ B_0(H), define a_n = e_n a. We show that ‖a_n − a‖ → 0. If not, then


∃ε > 0 ∀N ∃n > N : ‖a_n − a‖ ≥ ε, (B.429)


which in turn implies that for any δ > 0 there are unit vectors ψ_n for which we have
‖(a_n − a)ψ_n‖ > ε − δ. Take δ = ε/2, whence


∃ε > 0 ∀N ∃n > N : ‖(a_n − a)ψ_n‖ > ε/2. (B.430)


Now a is compact, so that, noting that ψ_n ∈ H_{≤1}, the sequence (aψ_n) has a
convergent subsequence, say with limit φ. We may then write


(a_n − a)ψ_n = (e_n − 1_H)(aψ_n − φ + φ), (B.431)


so that, for each ψ_n,


‖(a_n − a)ψ_n‖ ≤ ‖e_n − 1_H‖ ‖aψ_n − φ‖ + ‖(e_n − 1_H)φ‖. (B.432)


If we now restrict the ψ_n so as to lie in the convergent subsequence in question, then
the right-hand side vanishes as n → ∞:


• Since ‖e_n‖ = ‖1_H‖ = 1 we have ‖e_n − 1_H‖ ≤ 2;

• By construction we have lim_n ‖aψ_n − φ‖ = 0;

• For any basis of H, and any φ ∈ H, we have lim_n ‖(e_n − 1_H)φ‖ = 0 (although
‖e_n − 1_H‖ = 1 for all n if H is infinite-dimensional, so that e_n does not converge
to 1_H in norm!).


However, this contradicts (B.430).


We use the notation of this proof to establish the fifth clause of Theorem B.130. 
By the sixth, the operator aj, = a*e, is compact, since any finite-rank operator such 
as é, is compact and a* is bounded. Therefore, ||a* —a*|| = ||a, —a|| > 0, so a* — a* 
and hence a* € Bo(#) by clause 4. 
B.19 Spectral theory for self-adjoint compact operators


If only to establish our notation, let us begin by recalling Theorem A.10:


Theorem B.132. Let $\dim(H) < \infty$ and let $a : H \to H$ be a self-adjoint operator. Then the eigenvalues $\lambda$ of $a$ are real (collected in the point spectrum $\sigma_p(a) \subset \mathbb{R}$), the eigenspaces $H_\lambda$ corresponding to different eigenvalues $\lambda$ are orthogonal, and we have the spectral resolutions

$$a = \sum_{\lambda \in \sigma_p(a)} \lambda \cdot e_\lambda; \tag{B.433}$$
$$1_H = \sum_{\lambda \in \sigma_p(a)} e_\lambda, \tag{B.434}$$

where $e_\lambda$ is the projection onto the eigenspace

$$H_\lambda = \{\psi \in H \mid a\psi = \lambda\psi\}. \tag{B.435}$$
This theorem is equivalent to the following alternative version: 


Theorem B.133. Let $\dim(H) < \infty$ and let $a : H \to H$ be a self-adjoint operator (i.e., $a^* = a$). Then $a$ is diagonalizable, in the sense that $H$ has a basis $(\upsilon_i)$ consisting of eigenvectors of $a$. Furthermore, the eigenvalues $\lambda_i$ of $a$ are real.


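If $a$ is diagonalizable, using the familiar notation $e_{\upsilon_i} = |\upsilon_i\rangle\langle\upsilon_i|$, cf. (2.7), we write

$$a\upsilon_i = \lambda_i \upsilon_i; \tag{B.436}$$
$$a = \sum_{i \in I} \lambda_i e_{\upsilon_i}. \tag{B.437}$$

To move from Theorem B.132 to Theorem B.133, pick some basis $(\upsilon_k^{(\lambda)})$ of each eigenspace $H_\lambda$. By Proposition A.8 we then have

$$e_\lambda = \sum_{k=1}^{\dim(H_\lambda)} |\upsilon_k^{(\lambda)}\rangle\langle\upsilon_k^{(\lambda)}|. \tag{B.438}$$

The totality of all $\upsilon_k^{(\lambda)}$, where $\lambda \in \sigma_p(a)$ and $k = 1, \ldots, \dim(H_\lambda)$, is our basis: relabeling this set as $(\upsilon_i)$, eq. (B.434) becomes $1_H = \sum_i |\upsilon_i\rangle\langle\upsilon_i|$, or $\psi = \sum_i c_i \upsilon_i$ with $c_i = \langle\upsilon_i, \psi\rangle$ for each $\psi \in H$, which according to Theorem B.61.1 shows that $(\upsilon_i)$ is a basis of $H$ (and hence $i = 1, \ldots, \dim(H)$). Furthermore, (B.433) yields $a\upsilon_k^{(\lambda)} = \lambda\upsilon_k^{(\lambda)}$, or (B.436), so that each $\upsilon_i$ is an eigenvector of $a$.

Conversely, for each $\lambda \in \sigma_p(a)$, assemble all eigenvectors $\upsilon_i$ whose eigenvalues $\lambda_i$ are equal to $\lambda$ and relabel those as $\upsilon_k^{(\lambda)}$. This yields $e_\lambda$ through (B.438), and the above argument may be rerun in the opposite direction: the basis property of the $(\upsilon_i)$ implies (B.434), and the eigenvector property (B.436) yields (B.433) by verifying it on each basis vector $\upsilon_i = \upsilon_k^{(\lambda)}$, recalling that by construction, $\lambda_i = \lambda$.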
We now adapt these results to infinite dimension. We still say that an operator $a : H \to H$ is diagonalizable if $H$ has a basis $(\upsilon_i)$ consisting of eigenvectors of $a$.


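Proposition B.134. Let $H = \ell^2(I)$ for some set $I$ (i.e., $H$ has a basis $(\upsilon_i)_{i \in I}$). Then some collection $(\lambda_i)_{i \in I}$ of complex numbers occurs as the set of eigenvalues of some bounded operator $a \in B(H)$ iff $(\lambda_i)_{i \in I}$ is bounded, i.e., $\sup\{|\lambda_i|, i \in I\} < \infty$.


Defining a function $\lambda : I \to \mathbb{C}$ by $\lambda(i) = \lambda_i$, we may express this as $\lambda \in \ell^\infty(I)$.

Proof. If $a \in B(H)$ is diagonal in some basis $(\upsilon_i)$, with eigenvalues $(\lambda_i)$, then

$$|\lambda_i| = \|\lambda_i \upsilon_i\| = \|a\upsilon_i\| \leq \|a\|\,\|\upsilon_i\| = \|a\|, \tag{B.439}$$

for each $i \in I$, whence the eigenvalues are bounded. Conversely, if they are, so that $\|\lambda\|_\infty < \infty$, take a basis $(\upsilon_i)_{i \in I}$ of $H$, write $\psi = \sum_i c_i \upsilon_i$ with $\sum_i |c_i|^2 < \infty$, cf. Theorem B.61, and define $a\psi = \sum_i \lambda_i c_i \upsilon_i$. Since

$$\sum_i |\lambda_i c_i|^2 \leq \|\lambda\|_\infty^2 \sum_i |c_i|^2 = \|\lambda\|_\infty^2 \|\psi\|^2 < \infty, \tag{B.440}$$

we have $a\psi \in H$ by Lemma B.59. These estimates also prove that $\|a\psi\| \leq \|\lambda\|_\infty \|\psi\|$, so that $a$ is bounded, with $\|a\| \leq \|\lambda\|_\infty$ (in fact, equality holds here).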
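
As a numerical sanity check of the estimate just proved, the following sketch applies a diagonal operator to a coefficient vector and compares both sides of the bound. The truncation to 50 coefficients, the particular eigenvalue sequence, and the helper names `apply_diagonal` and `norm` are illustrative assumptions, not part of the text.

```python
import math

def apply_diagonal(lam, c):
    """Apply the diagonal operator with eigenvalues lam to the coefficient vector c."""
    return [l * x for l, x in zip(lam, c)]

def norm(c):
    """Euclidean (l^2) norm of a finite coefficient vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in c))

# A bounded eigenvalue sequence, truncated to 50 terms (illustrative choice).
lam = [(-1) ** i * (1 - 1 / (i + 2)) for i in range(50)]
sup_lam = max(abs(l) for l in lam)

# An arbitrary coefficient vector psi = sum_i c_i v_i.
psi = [1 / (i + 1) for i in range(50)]

lhs = norm(apply_diagonal(lam, psi))   # ||a psi||
rhs = sup_lam * norm(psi)              # ||lambda||_inf * ||psi||
print(lhs <= rhs)
```

On a basis vector the bound specializes to (B.439): the norm of the image equals the modulus of the corresponding eigenvalue.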


This characterization of bounded diagonalizable operators by a property of their 
eigenvalues may be considerably sharpened for self-adjoint compact operators. 


Theorem B.135. Let $\dim(H) = \infty$, and let $a \in B(H)_{sa}$. Then $a$ is compact iff it is diagonalizable with $\lambda \in \ell_0(I)$, in which case the sum in (B.437) converges in norm.


We recall that some function $f : I \to \mathbb{C}$ is in $\ell_0(I)$ if for each $\varepsilon > 0$ there is a finite subset $I_\varepsilon \subset I$ such that $|f(i)| < \varepsilon$ for all $i \notin I_\varepsilon$. If $I = \mathbb{N}$ (and in fact the proof below will produce this labeling of the basis), then the condition $\lambda \in \ell_0(\mathbb{N})$ just means that

$$\lim_{n \to \infty} \lambda_n = 0. \tag{B.441}$$


Before proving this, we state the infinite-dimensional analogue of Theorem B.132: 


Theorem B.136. Let $\dim(H) = \infty$ and let $a$ be some bounded self-adjoint operator. Then $a$ is compact iff it has the properties stated in Theorem B.132, amended by the following clarifications and addenda (cf. Definition B.6, where $X = \sigma_p(a)$):


1. The sum in (B.433) converges in norm;
2. The sum in (B.434) converges strongly, i.e., for each $\psi \in H$ we have

$$\psi = \sum_{\lambda \in \sigma_p(a)} e_\lambda \psi; \tag{B.442}$$

3. If $\lambda \in \sigma(a)$ and $\lambda \neq 0$, then $\lambda \in \sigma_p(a)$ and $\dim(H_\lambda) < \infty$;
4. Always $0 \in \sigma(a)$, and $\sigma_p(a) \subset \mathbb{R}$ has 0 as its only accumulation point.


The equivalence between Theorems B.135 and B.136 is a bit more subtle than in 
finite dimension, but the key to the proof of both is the following lemma. 
Lemma B.137. A compact self-adjoint operator $a$ has an eigenvalue $\lambda = \pm\|a\|$.


Note that by definition of the operator norm, one always has $|\lambda| \leq \|a\|$, whether or not $a$ is compact, but the point about compact self-adjoint operators is firstly that they have an eigenvalue at all, and secondly that the above inequality is saturated.


Proof. We use the fact that the norm $\psi \mapsto \|\psi\|$ is continuous on $H$, see (B.5), so that it attains a maximum on the compact set $a(H_{\leq 1})$. Assume that this maximum is attained at $a\psi_1$, with $\|\psi_1\| = 1$. By definition of the operator norm, this maximum must be $\|a\|$, so that $\|a\|^2 = \|a\psi_1\|^2$. Cauchy-Schwarz and $a^* = a$ then yield

$$\|a\|^2 = \langle a\psi_1, a\psi_1\rangle = \langle\psi_1, a^2\psi_1\rangle \leq \|\psi_1\|\,\|a^2\psi_1\| \leq \|a^2\| = \|a\|^2, \tag{B.443}$$

where we have used (C.2). In the Cauchy-Schwarz inequality (A.1) one has equality iff either $\upsilon = 0$ or $w = z\upsilon$ for some $z \in \mathbb{C}$, so that we must have $a^2\psi_1 = z\psi_1$ with $|z| = \|a\|^2$. Moreover, $z \in \mathbb{R}$, as eigenvalues must be real (which trivially follows from $a^* = a$, one does not even need Theorem B.93 here), so $a^2\psi_1 = \lambda^2\psi_1$, with either $\lambda = \|a\|$ or $\lambda = -\|a\|$. If $a\psi_1 = \lambda\psi_1$, we are ready. If not, then $\chi_1 = a\psi_1 - \lambda\psi_1 \neq 0$, in which case $a\chi_1 = a^2\psi_1 - \lambda a\psi_1 = \lambda^2\psi_1 - \lambda a\psi_1 = -\lambda\chi_1$.
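
The last step of this proof can be checked by hand on a small example. The sketch below is only an illustration under assumed data: the 2×2 matrix, the vector `psi1`, and the helper `matvec` are chosen for the example. The matrix has norm 1, the chosen unit vector satisfies $a^2\psi_1 = \lambda^2\psi_1$ with $\lambda = 1$ but is not itself an eigenvector, and $\chi_1 = a\psi_1 - \lambda\psi_1$ turns out to be an eigenvector with eigenvalue $-\lambda$.

```python
def matvec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

a = [[0.0, 1.0], [1.0, 0.0]]    # self-adjoint, operator norm 1 (example data)
lam = 1.0                        # candidate eigenvalue +||a||
psi1 = [1.0, 0.0]                # unit vector where ||a psi|| is maximal

a_psi1 = matvec(a, psi1)         # (0, 1): not a multiple of psi1
a2_psi1 = matvec(a, a_psi1)      # (1, 0) = lam^2 * psi1

# psi1 is not an eigenvector, so form chi1 = a psi1 - lam psi1.
chi1 = [x - lam * y for x, y in zip(a_psi1, psi1)]
a_chi1 = matvec(a, chi1)         # should equal -lam * chi1

print(chi1, a_chi1)
```

Here `chi1` is `[-1.0, 1.0]` and `a_chi1` is `[1.0, -1.0]`, so the eigenvalue realized in this example is $-\|a\|$.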


Corollary B.138. A compact self-adjoint operator is diagonalizable. 


Proof. Using the notation of the above proof, we call the (normalized) eigenvector in question $\upsilon_1$ (so either $\upsilon_1 = \psi_1$ or $\upsilon_1 = \chi_1$). Note that if $\langle\varphi, \upsilon_1\rangle = 0$, then $\langle a\varphi, \upsilon_1\rangle = \langle\varphi, a^*\upsilon_1\rangle = \langle\varphi, a\upsilon_1\rangle = \pm\lambda\langle\varphi, \upsilon_1\rangle = 0$, so that $a$ maps the orthogonal complement $\upsilon_1^\perp = \{\varphi \in H \mid \langle\upsilon_1, \varphi\rangle = 0\}$ of $\upsilon_1$ into itself. This implies that $a$ commutes with the projection $e_1$ onto $\upsilon_1^\perp$, i.e., $e_1 a = a e_1$, and hence also $e_1 a = e_1 a e_1$, in which the right-hand side is essentially the restriction of $a$ to $\upsilon_1^\perp = e_1 H$.

By Theorem B.130.6, the operator $e_1 a$ is compact, like $a$ itself, and it is also self-adjoint. If $e_1 a = 0$ we are ready, since $\upsilon_1$ plus any basis of $e_1 H$ is a basis of $H$ that diagonalizes $a$. If not, we apply Lemma B.137 to the operator $e_1 a$, finding an eigenvector $\upsilon_2$ with nonzero eigenvalue $\lambda_2$. A simple computation shows that $e_1 \upsilon_2 = \upsilon_2$, so that $\upsilon_2 \in e_1 H$, from which we infer, in turn, that $a\upsilon_2 = \lambda_2 \upsilon_2$.

So we have found two basis vectors $(\upsilon_1, \upsilon_2)$ of $H$ that are eigenvectors of $a$. The above procedure may then be iterated: we define $e_2$ as the projection onto the orthogonal complement of $\upsilon_1$ and $\upsilon_2$, and consider $e_2 a$. If $e_2 a = 0$ we are ready; if not, we find a third eigenvector of $e_2 a$ and hence of $a$ in $e_2 H$, et cetera.


• If $H = \mathbb{C}^n$ is finite-dimensional, this procedure terminates after $n$ steps, leaving a basis $\{\upsilon_1, \ldots, \upsilon_n\}$ of $H$ that by construction consists of eigenvectors of $a$.

• If $H$ is separable, the iteration procedure may be continued countably many times, leading to an ordered countable set $B = (\upsilon_1, \upsilon_2, \ldots)$ of orthogonal unit vectors that are eigenvectors of $a$. By construction we have $|\lambda_N| \geq |\lambda_{N+1}|$ for all $N \in \mathbb{N}$, and hence there are two scenarios: either $e_N a = 0$ for all $N \geq N_0 \in \mathbb{N}$ (with $e_N a \neq 0$ if $N < N_0$), in which case $a = 0$ on $(\upsilon_1, \ldots, \upsilon_N)^\perp$, or all $|\lambda_N| > 0$.

• In general, consider the set of all orthonormal sets in $H$ that consist of eigenvectors of $a$. This set is nonempty by the argument above, and is inductively ordered by inclusion, so by Zorn's Lemma it must have a maximal element $B$.

By Theorem B.61.5, the set $B$ is a basis of $H$ iff $B^\perp = \{0\}$. To show that this is the case, suppose $B^\perp$ is a nonzero Hilbert space. Define $f$ as the projection on $H$ with image $B^\perp$ and consider the self-adjoint compact operator $fa$. If $fa = 0$, there is at least one eigenvector of $a$ in $B^\perp = fH$ (namely, with eigenvalue zero), which is a contradiction. If $fa \neq 0$, then $a$ has an eigenvector in $B^\perp$ by Lemma B.137 (applied to $fa$), and again a contradiction has been found: for in all three cases, by construction all eigenvectors were already contained in $B^{\perp\perp} = \mathrm{span}(\upsilon_1, \ldots)^-$.


Even if H is non-separable, the image of a compact operator a must nonetheless 
be separable. Therefore, the non-zero eigenvalues of a form a countable set, and 
the eigenvalue zero (which, by the same token, must occur in the non-separable 
case) has some uncountable multiplicity (in sharp contrast to which, each nonzero 
eigenvalue has finite multiplicity). Also in the separable case, the only eigenvalue 
that may have infinite multiplicity is zero (though in the separable case it does not 
necessarily occur). Theorem B.135 is now a consequence of the following lemma: 


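Lemma B.139. A diagonalizable operator $a$ is compact iff $\lambda \in \ell_0(I)$.


Proof. In view of the proof and subsequent comment above, we may as well assume that $I = \mathbb{N}$. For any $\psi \in H$, the sum in (B.214) converges, so we must have $\lim_n \langle\upsilon_n, \psi\rangle = 0$, or, in other words, $\upsilon_n \to 0$ weakly. If $a \in B_0(H)$, then $a\upsilon_n \to 0$ in norm by Proposition B.125, and hence $\lambda_n \to 0$, i.e., $\lambda \in \ell_0(\mathbb{N})$. Conversely, if this holds, then for each $\varepsilon > 0$, the set $I_\varepsilon = \{n \in \mathbb{N} : |\lambda_n| > \varepsilon\}$ is finite. This implies that the operator $a_n = \sum_{m \in I_{1/n}} \lambda_m e_{\upsilon_m}$ has finite rank. Since $|\lambda_m| \leq \varepsilon$ whenever $m \notin I_{1/n}$, where $\varepsilon = 1/n$,

$$\|(a_n - a)\psi\|^2 = \Big\|\sum_{m \notin I_{1/n}} \lambda_m e_{\upsilon_m}\psi\Big\|^2 \leq \sum_{m \notin I_{1/n}} |\lambda_m|^2\,|\langle\upsilon_m, \psi\rangle|^2 \leq \varepsilon^2\|\psi\|^2, \tag{B.444}$$

where in the last step we also used (B.213). Hence $a_n \to a$ in norm, so that $a$ is compact by Proposition B.131.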
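
The contrast between a compact diagonal operator and the identity can be made concrete. The sketch below relies on the fact, noted after Proposition B.134, that the norm of a diagonal operator is the supremum of the moduli of its eigenvalues, so the error of a finite-rank truncation is the supremum over the discarded eigenvalues. The sequence $\lambda_n = 1/n$, the cut-off `M`, the truncation to the first `N` eigenvalues, and the helper name `tail_norm` are assumptions made for illustration.

```python
def tail_norm(lam, N):
    """Operator norm of a - a_N for diagonal a, where a_N keeps the first N
    eigenvalues: the supremum of |lam_m| over the discarded indices m > N."""
    return max((abs(l) for l in lam[N:]), default=0.0)

M = 10_000                                       # finite stand-in for the infinite sequence
lam_compact = [1 / n for n in range(1, M + 1)]   # lambda_n -> 0
lam_identity = [1.0] * M                         # eigenvalues of 1_H

for N in (10, 100, 1000):
    # The first error shrinks like 1/(N+1); the second stays equal to 1.
    print(N, tail_norm(lam_compact, N), tail_norm(lam_identity, N))
```

The first column of errors tends to zero, in line with norm convergence for $\lambda \in \ell_0(\mathbb{N})$, whereas for the identity the truncation error never drops below 1.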


To finish the proof of Theorem B.135, we show that the sum in (B.437), which for general bounded diagonalizable operators converges strongly, in fact converges in norm. To put this in perspective, eq. (B.437) with $a = 1_H$ reads

$$1_H = \sum_{i \in I} e_{\upsilon_i}. \tag{B.445}$$

If $I$ is infinite, this sum cannot converge uniformly: e.g., if we take $I = \mathbb{N}$, then

$$\lim_{N \to \infty} \Big\|1_H - \sum_{i=1}^N e_{\upsilon_i}\Big\| = \lim_{N \to \infty} \sup_{\|\psi\| = 1} \Big\|\psi - \sum_{i=1}^N \langle\upsilon_i, \psi\rangle\upsilon_i\Big\| \tag{B.446}$$

cannot be zero, as shown by taking $\psi$ orthogonal to all $\upsilon_1, \ldots, \upsilon_N$. However, by Theorem B.61.1 the sum does converge strongly (i.e., applied to each fixed $\psi$). This seemingly special case even yields strong convergence of the sum in (B.437) for general diagonalizable bounded operators $a$, for by continuity of $a$ we have:



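$$a\psi = a\sum_{i \in I} \langle\upsilon_i, \psi\rangle\upsilon_i = \sum_{i \in I} \langle\upsilon_i, \psi\rangle a\upsilon_i = \sum_{i \in I} \lambda_i\langle\upsilon_i, \psi\rangle\upsilon_i = \sum_{i \in I} \lambda_i e_{\upsilon_i}\psi. \tag{B.447}$$

If $a$ is compact, strong convergence of (B.437) may be strengthened to norm convergence. The argument is analogous to the proof of Lemma B.139, but for completeness and contrast we now present it for general $I$. Since $\lambda \in \ell_0(I)$, for given $\varepsilon > 0$ there is a finite set $I_\varepsilon \subset I$ for which $|\lambda_i| < \varepsilon$ for all $i \notin I_\varepsilon$. For fixed $\psi \in H$, we have

$$\Big\|\Big(a - \sum_{i \in I_\varepsilon} \lambda_i e_{\upsilon_i}\Big)\psi\Big\|^2 = \Big\|\sum_{i \notin I_\varepsilon} \lambda_i e_{\upsilon_i}\psi\Big\|^2 \leq \varepsilon^2 \sum_{i \notin I_\varepsilon} |\langle\upsilon_i, \psi\rangle|^2 \leq \varepsilon^2\|\psi\|^2, \tag{B.448}$$

so that $\|a - \sum_{i \in I_\varepsilon} \lambda_i e_{\upsilon_i}\| \leq \varepsilon$. By Definition B.6, eq. (B.437) holds in norm.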


This analysis by no means contradicts Corollary B.104, including (B.327): applied to compact operators, exactly one of the subsets $A_n \subset \sigma(a)$ contains $\sigma(a) \cap U_0$, where $U_0$ is some neighborhood of $0 \in \sigma(a)$, so that the corresponding projection $e_{A_n}$ is infinite-dimensional and all the other $e_{A_m}$ are finite-dimensional. Thus the sum $\sum_n e_{A_n}$ in (B.327) takes a rather different form from either the sum $\sum_i e_{\upsilon_i}$ in (B.445) or the sum $\sum_\lambda e_\lambda$ in (B.434); see also the end of this section.

We now prove Theorem B.136. First, as soon as $\dim(H_\lambda) = \infty$ for some $\lambda \neq 0$, then $\lambda \notin \ell_0(I)$. Therefore, $\dim(H_\lambda) < \infty$ by Theorem B.135. In fact, it is easy to show directly that $\dim(\ker(a - \lambda)) < \infty$ for any $a \in B_0(H)$ and $\lambda \neq 0$: since $a$ is bounded and hence $\ker(a - \lambda)$ is closed, the latter is a Hilbert space in its own right, so if it were infinite-dimensional, any basis $(u_n)$ of it would have the property that $u_n \to 0$ weakly and hence $au_n \to 0$ in norm (cf. the proof of the above lemma). But $au_n = \lambda u_n$, so that $(au_n)$ cannot converge in norm as soon as $\lambda \neq 0$.

Second, take $0 \neq \lambda \in \sigma(a)$. According to Theorem B.93, in order to prove that $\lambda \in \sigma_p(a)$, it suffices to show that ran$(a - \lambda)$ is closed. We may assume that $\lambda \neq \lambda_i$ for all $i \in I$ (for otherwise, trivially $\lambda \in \sigma_p(a)$), which implies $\ker(a - \lambda) = \{0\}$.

Let $\psi_n = (a - \lambda)\varphi_n \in \mathrm{ran}(a - \lambda)$, with $\varphi_n \neq 0$ for all $n$, and suppose $\psi_n \to \psi$. We prove that $(\varphi_n)$ is bounded. If not, then $\|\varphi_n\| \to \infty$, but since $(\varphi'_n)$ is bounded, with $\varphi'_n = \varphi_n/\|\varphi_n\|$, and $(\psi_n)$ converges, we have $(a - \lambda)\varphi'_n = \psi_n/\|\varphi_n\| \to 0$. Now $a$ is compact, so $(a\varphi'_n)$ has a convergent subsequence, which together with the previous result implies that $(\varphi'_n)$ itself must have a convergent subsequence (as $\lambda \neq 0$), say to $\varphi'$. Continuity of $a$ gives $(a - \lambda)\varphi' = 0$, hence $\varphi' \in \ker(a - \lambda) = \{0\}$. But this is impossible, as $\|\varphi'_n\| = 1$ for all $n$. Thus knowing that $(\varphi_n)$ is bounded, once again using compactness of $a$, we infer that $(a\varphi_n)$ has a convergent subsequence. Now

$$\varphi_n = \lambda^{-1}(a\varphi_n - (a - \lambda)\varphi_n) = \lambda^{-1}(a\varphi_n - \psi_n), \tag{B.449}$$

and since $(\psi_n)$ converges by assumption, this implies that $(\varphi_n)$ has a convergent subsequence, say with limit $\varphi$. Continuity of $a$ then implies that

$$\psi = (a - \lambda)\varphi \in \mathrm{ran}(a - \lambda), \tag{B.450}$$

and hence ran$(a - \lambda)$ is closed. Therefore, $\lambda \in \sigma_p(a)$.
To show that $0 \in \sigma(a)$, assume that $a$ were invertible (which is to say that $0 \in \rho(a)$). Then its inverse $a^{-1}$ would be bounded, so that $a^{-1}a = 1_H \in B_0(H)$ by Theorem B.130. But this is impossible in infinite dimension: a similar argument to the one below (B.445) shows that $1_H$ cannot possibly be approximated by finite-rank operators. The last claim of Theorem B.136 is the same as $\lambda \in \ell_0(I)$.


Here is a nice example of compact operators, also justifying the notation $B_0(H)$.


Corollary B.140. Let $H = \ell^2(\mathbb{N})$ and for $f \in \ell^\infty(\mathbb{N})$, define the multiplication operator $m_f$ as usual, i.e., $m_f\psi = f\psi$. Then $m_f$ is compact iff $f \in \ell_0(\mathbb{N})$.


Proof. This follows from Theorem B.135, where the label set is $I = \mathbb{N}$, the basis $(\upsilon_i)_{i \in I}$ is $(\delta_n)_{n \in \mathbb{N}}$, where $\delta_n(m) = \delta_{nm}$ as usual, $m \in \mathbb{N}$, and the eigenvalues are

$$\lambda_n = f(n), \tag{B.451}$$

since obviously $m_f\delta_n = f\delta_n = f(n)\delta_n$. We already know from (B.276) that $\sigma(m_f) = \mathrm{ran}(f)^-$, which for $f \in \ell_0(\mathbb{N})$ equals ran$(f)$ if $0 \in \mathrm{ran}(f)$, and

$$\mathrm{ran}(f)^- = \mathrm{ran}(f) \cup \{0\}, \tag{B.452}$$

otherwise. In the first case, $\sigma(m_f) = \sigma_p(m_f) = \mathrm{ran}(f)$, so $\sigma_c(m_f) = \emptyset$, whereas in the second case we have $\sigma_p(m_f) = \mathrm{ran}(f)$ and $\sigma_c(m_f) = \{0\}$. This also shows that in clause 4 of Theorem B.136, both possibilities $0 \in \sigma_p(a)$ and $0 \in \sigma_c(a)$ may occur, depending on $a$. Finally, the condition $\lambda \in \ell_0(I)$, which in the example $a = m_f$ reduces to (B.441), is just a restatement of the condition $f \in \ell_0(\mathbb{N})$.


In the continuous case, for $H = L^2(X)$, say for some connected open set $X \subset \mathbb{R}^n$ with Lebesgue measure, the multiplication operator $m_f$ defined by a function $f \in C_0(X)$ is never compact, cf. (B.276); it is the very opposite of a compact operator!

To close, in our (traditional) proof of Theorem B.136 we did not use the powerful spectral Theorem B.94. If $\dim(H) < \infty$, Theorem B.132 indeed follows from Theorem B.94: if, for $\lambda \in \mathbb{R}$, we define $1_{\{\lambda\}} \equiv \delta_\lambda : \mathbb{R} \to \mathbb{C}$ by $\delta_\lambda(x) = \delta_{\lambda x}$, then

$$\mathrm{id}_{\sigma(a)} = \sum_{\lambda \in \sigma_p(a)} \lambda \cdot \delta_\lambda; \tag{B.453}$$
$$1_{\sigma(a)} = \sum_{\lambda \in \sigma_p(a)} \delta_\lambda. \tag{B.454}$$

Now define $e_\lambda = \delta_\lambda(a)$. Then (B.290) - (B.291) give $e_\lambda^2 = e_\lambda^* = e_\lambda$, so that $e_\lambda$ is a projection. Furthermore, since $\mathrm{id}_{\sigma(a)}\delta_\lambda = \lambda \cdot \delta_\lambda$, eq. (B.290) gives $ae_\lambda = \lambda e_\lambda$, so that $e_\lambda H \subseteq H_\lambda$. Applying the map $f \mapsto f(a)$ to (B.453) - (B.454) then yields (B.433) - (B.434), from which the equality $e_\lambda H = H_\lambda$ follows a fortiori.

If $\dim(H) = \infty$ and $a \in B_0(H)_{sa}$, this still works for each nonzero $\lambda \in \sigma_p(a)$, and since the sum (B.453) converges uniformly in $C(\sigma(a))$, we obtain (B.433) in the same way, including its norm-convergence. Unfortunately, even if we replace $\sigma_p(a)$ by $\sigma(a)$, as we should, eq. (B.454) now fails, even pointwise, so that (B.434) still requires the kind of proof we gave (or a complicated argument based on (B.327)).
B.20 The trace


For finite-dimensional $H$ the trace was defined by (A.77). There are (at least) two difficulties in generalizing this expression to the infinite-dimensional case in the naive way. First, not every operator has a finite trace; for example, take $a = 1_H$, so that $\mathrm{Tr}(1_H) = \dim(H)$. Second, Lemma A.25 is no longer valid in general: it is easy to find an operator $a \in B(H)$ and bases $(\upsilon_i)$ and $(\upsilon'_i)$ of $H$ for which

$$\sum_i \langle\upsilon_i, a\upsilon_i\rangle \neq \sum_i \langle\upsilon'_i, a\upsilon'_i\rangle,$$

typically because one of these expressions converges, whereas the other diverges. For example, take $a = \sum_i (-1)^i |\upsilon_i\rangle\langle\upsilon_i|$ as a strong limit, i.e., $a\psi = \sum_i (-1)^i\langle\upsilon_i, \psi\rangle\upsilon_i$; this lies in $H$ by Theorem B.61, from which (B.214) shows that $\|a\psi\| = \|\psi\|$. Take $\upsilon'_1 = (\upsilon_1 + \upsilon_2)/\sqrt{2}$, $\upsilon'_2 = (\upsilon_1 - \upsilon_2)/\sqrt{2}$, $\upsilon'_3 = (\upsilon_3 + \upsilon_4)/\sqrt{2}$, $\upsilon'_4 = (\upsilon_3 - \upsilon_4)/\sqrt{2}$, etc. Then $\sum_i \langle\upsilon_i, a\upsilon_i\rangle = \sum_i (-1)^i$ diverges, whereas $\sum_i \langle\upsilon'_i, a\upsilon'_i\rangle = \sum_i 0 = 0$.
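
The basis dependence in this example can be seen directly from the diagonal matrix elements. The following sketch is an illustration only: the function names are hypothetical, and the rotated-basis entry is computed from the pairing of basis vectors described above rather than from an explicit matrix.

```python
import math

def diag_sum_original(n_terms):
    """Partial sum of <v_i, a v_i> = (-1)^i over i = 1..n_terms."""
    return sum((-1) ** i for i in range(1, n_terms + 1))

def diag_entry_rotated(i):
    """<v'_i, a v'_i> for the rotated basis built from the pair (v_{2k-1}, v_{2k})."""
    k = (i + 1) // 2                 # index of the pair containing v'_i
    s = 1 / math.sqrt(2)             # normalization of v'_i
    # Cross terms vanish because a is diagonal in (v_i); only the two
    # diagonal contributions (-1)^(2k-1) and (-1)^(2k) survive.
    return s * s * ((-1) ** (2 * k - 1) + (-1) ** (2 * k))

print([diag_sum_original(n) for n in range(1, 7)])   # oscillates, no limit
print([diag_entry_rotated(i) for i in range(1, 7)])  # every entry vanishes
```

The partial sums in the original basis oscillate between $-1$ and $0$, while every diagonal entry in the rotated basis is zero.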

However, if $a \in B(H)$ is positive, i.e., $a \geq 0$ in the usual sense that $\langle\psi, a\psi\rangle \geq 0$ for each $\psi \in H$, then we will show that for any two bases $(\upsilon_i)$ and $(\upsilon'_i)$ of $H$,

$$\sum_i \langle\upsilon_i, a\upsilon_i\rangle = \sum_i \langle\upsilon'_i, a\upsilon'_i\rangle, \tag{B.455}$$

where both sides may be infinite. Equivalently, (A.79) is valid, since any unitary operator defines and is defined by a basis transformation. To prove (B.455), we need a very useful construction of independent interest, cf. (A.73).


Lemma B.141. Any positive operator $a \in B(H)$ has a (unique) square root, i.e., a positive operator $\sqrt{a} \in C^*(a)$ that satisfies $(\sqrt{a})^2 = a$.


Proof. This follows from Theorem B.94, since if $a \geq 0$, then $\sigma(a) \subset \mathbb{R}^+$, and hence $\sqrt{\cdot}$ is defined on $\sigma(a)$. Alternatively, one may use the following construction due to the Dutch mathematician C. Visser (which is a special case of the approach just mentioned). If necessary, first rescale $a$ so that $\|a\| \leq 1$, take the power series for

$$\sqrt{1 - x} = \sum_{k \geq 0} t_k x^k, \tag{B.456}$$

(in which $t_0 = 1$), which converges absolutely for $|x| \leq 1$, and put

$$\sqrt{a} = \sum_{k \geq 0} t_k (1_H - a)^k. \tag{B.457}$$

As in the numerical case, squaring the series and rearranging terms yields $(\sqrt{a})^2 = a$. Since uniqueness will not be needed, we omit the proof.
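
The power-series construction can be tried out numerically. The sketch below is a minimal illustration, not the construction's proof: it sums the binomial series for $\sqrt{1 - x}$ evaluated at $1 - a$ for a small positive matrix with norm at most 1. The matrix, the number of terms, and the helper names `matmul` and `sqrt_series` are assumptions; the coefficients are generated by the standard recurrence for the binomial series with exponent 1/2.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sqrt_series(a, terms=400):
    """Approximate sqrt(a) by the truncated series sum_k c_k (1 - a)^k."""
    n = len(a)
    ident = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [[ident[i][j] - a[i][j] for j in range(n)] for i in range(n)]  # 1 - a
    power = ident        # (1 - a)^0
    coeff = 1.0          # c_0 = 1
    result = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        if k > 0:
            coeff *= (k - 1.5) / k       # c_k = c_{k-1} * (k - 3/2) / k
            power = matmul(power, d)     # (1 - a)^k
        result = [[result[i][j] + coeff * power[i][j] for j in range(n)]
                  for i in range(n)]
    return result

a = [[0.5, 0.2], [0.2, 0.3]]             # positive, with norm below 1 (example data)
root = sqrt_series(a)
square = matmul(root, root)
err = max(abs(square[i][j] - a[i][j]) for i in range(2) for j in range(2))
print(err)
```

For this matrix the eigenvalues of $1 - a$ lie well inside the unit interval, so the truncated series converges quickly; near the boundary $\|1 - a\| = 1$ many more terms would be needed.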


For $a \geq 0$, we now use (B.215) to compute

$$\sum_i \langle\upsilon_i, a\upsilon_i\rangle = \sum_i \langle\sqrt{a}\upsilon_i, \sqrt{a}\upsilon_i\rangle = \sum_{i,j} \langle\sqrt{a}\upsilon_i, \upsilon'_j\rangle\langle\upsilon'_j, \sqrt{a}\upsilon_i\rangle$$
$$= \sum_{i,j} \langle\sqrt{a}\upsilon'_j, \upsilon_i\rangle\langle\upsilon_i, \sqrt{a}\upsilon'_j\rangle = \sum_j \langle\upsilon'_j, a\upsilon'_j\rangle, \tag{B.458}$$

where each term in every sum is positive, so that rearrangements are valid. Let

$$B(H)_+ = \{a \in B(H) \mid a \geq 0\}. \tag{B.459}$$

In view of (B.458), we have a well-defined map

$$\mathrm{Tr} : B(H)_+ \to [0, \infty]; \tag{B.460}$$
$$\mathrm{Tr}(a) = \sum_i \langle\upsilon_i, a\upsilon_i\rangle, \tag{B.461}$$

where $(\upsilon_i)$ is an arbitrary basis of $H$, of which the result is independent by (B.455). To drop the restriction $a \geq 0$ in the argument of the trace, for any $a \in B(H)$ we note that $a^*a \geq 0$, so that we may define the absolute value $|a|$ of $a$ by

$$|a| = \sqrt{a^*a}. \tag{B.462}$$

Then $|a| \geq 0$ for all $a$ by construction, and if $a \geq 0$, then $|a| = a$. Finally, we define the set of trace-class operators in $B(H)$, later seen to be a Banach space, as

$$B_1(H) = \{a \in B(H) \mid \mathrm{Tr}(|a|) < \infty\}. \tag{B.463}$$

The trace-norm of $a \in B_1(H)$, which for now is just a formula, is given by

$$\|a\|_1 = \mathrm{Tr}(|a|). \tag{B.464}$$

Lemma B.142. 1. For any $a \in B_1(H)$ we have

$$\|a\| \leq \|a\|_1. \tag{B.465}$$

2. Any trace-class operator is compact, i.e., $B_1(H) \subseteq B_0(H)$.
3. For $b \in B(H)$ and $a \in B_1(H)$ one has (A.100), i.e., $|\mathrm{Tr}(ab)| \leq \|a\|_1\|b\|$.
4. The trace-class operators $B_1(H)$ form a vector space with norm (B.464).


Part 4 will shortly be improved to $B_1(H)$ actually being a Banach space.
Let us note that Lemma A.28 and Proposition A.29 on the polar decomposition 
remain valid for infinite-dimensional Hilbert space, with essentially the same proof. 


Proof. 1. By definition of the operator norm (B.227), for any $b \in B(H)$ and every $\varepsilon > 0$ there is a unit vector $\psi \in H$ such that $\|b\|^2 \leq \|b\psi\|^2 + \varepsilon$ (proof by contradiction). Put $b = (a^*a)^{1/4}$, and note that $\|(a^*a)^{1/4}\|^2 = \||a|\| = \|a\|$ by (C.2) and (A.93). Completing $\psi$ to a basis $(\upsilon_i)$, and noting that

$$\sum_i \|(a^*a)^{1/4}\upsilon_i\|^2 = \sum_i \langle(a^*a)^{1/4}\upsilon_i, (a^*a)^{1/4}\upsilon_i\rangle = \sum_i \langle\upsilon_i, |a|\upsilon_i\rangle = \|a\|_1, \tag{B.466}$$

we obtain

$$\|a\| = \|(a^*a)^{1/4}\|^2 \leq \|(a^*a)^{1/4}\psi\|^2 + \varepsilon \leq \sum_i \|(a^*a)^{1/4}\upsilon_i\|^2 + \varepsilon = \|a\|_1 + \varepsilon.$$

Since this holds for all $\varepsilon > 0$, one has (B.465).

2. Let $a \in B_1(H)$. Since $\sum_i \langle\upsilon_i, |a|\upsilon_i\rangle < \infty$, for each $\varepsilon > 0$ we can find $n$ such that $\sum_{i > n} \langle\upsilon_i, |a|\upsilon_i\rangle < \varepsilon$. Let $e_n$ be the projection onto the linear span of $\{\upsilon_i\}_{i=1,\ldots,n}$. Using (C.2) in the form $\|a\|^2 = \|aa^*\|$ (which is valid by (A.22)) and (B.465),

$$\|e_n^\perp|a|^{1/2}\|^2 = \|e_n^\perp|a|e_n^\perp\| \leq \|e_n^\perp|a|e_n^\perp\|_1 = \sum_i \langle\upsilon_i, e_n^\perp|a|e_n^\perp\upsilon_i\rangle = \sum_{i > n} \langle\upsilon_i, |a|\upsilon_i\rangle < \varepsilon,$$

for $|e_n^\perp|a|e_n^\perp| = e_n^\perp|a|e_n^\perp$, for if $c \geq 0$ then $b^*cb \geq 0$ for any $b, c \in B(H)$. Since $e_n^\perp = 1 - e_n$, it follows that $e_n|a|^{1/2} \to |a|^{1/2}$ in the norm topology. Since each operator $e_n|a|^{1/2}$ obviously has finite rank, $|a|^{1/2}$ and hence $|a|$ is compact. Finally, $a$ has polar decomposition $a = u|a|$ and $B_0(H)$ is a two-sided ideal in $B(H)$.

3. We just showed that $a$ is compact. By Theorem B.130, also $a^*a$ is compact, and since it is self-adjoint, Theorem B.136 applies. This gives an expansion (A.101); although the sum may be infinite, this is no problem, as it is norm-convergent. Thus the computation will be analogous to the finite-dimensional case, cf. Proposition A.30, except that we cannot use (A.78), which is valid but has not been proved yet. Fortunately, this problem may be obviated using (A.94). It follows from Lemma A.28 and Proposition A.29 that $(\upsilon'_i = u\upsilon_i)$ also forms an orthonormal set, like the $\upsilon_i$ themselves, since the closed linear space spanned by the unit vectors $\upsilon_i$ is just $(\mathrm{ran}\,|a|)^-$ and $u$ is unitary from this space onto its image $(\mathrm{ran}\,a)^-$. Taking the trace over any basis that contains the vectors $\upsilon'_i$, we compute

$$|\mathrm{Tr}(ab)| = |\mathrm{Tr}(u|a|u^*ub)| = \Big|\sum_i p_i\langle\upsilon'_i, ub\upsilon'_i\rangle\Big|$$
$$\leq \sum_i p_i|\langle\upsilon'_i, ub\upsilon'_i\rangle| \leq \sum_i p_i\|b\|\,\|u\|\,\|\upsilon'_i\| = \|a\|_1\|b\|, \tag{B.467}$$

where we used $\|a\|_1 = \sum_i p_i$, which follows from (A.101) applied to $|a|$.

4. Let $a, b \in B_1(H)$, and let $a + b = u|a + b|$ be the polar decomposition. Then

$$\|a + b\|_1 = \mathrm{Tr}(u^*(a + b)) = \mathrm{Tr}(u^*a) + \mathrm{Tr}(u^*b).$$

Applying (A.100) with $\|u^*\| \leq 1$, one has $\|a + b\|_1 \leq \|a\|_1 + \|b\|_1$. Hence $B_1(H)$ is a vector space and $\|\cdot\|_1$ satisfies the triangle inequality. The other axioms for a norm are obviously satisfied.


Proposition B.143. Let $H = \ell^2(\mathbb{N})$ (or even $\ell^2(X)$, for any countable set $X$), and for $f \in \ell^\infty(\mathbb{N})$, define the corresponding multiplication operator $m_f$ by $m_f\psi = f\psi$, cf. Proposition B.73. We have seen that $m_f$ is bounded, with norm (B.239). Then:

$$m_f \in B_0(H) \text{ iff } f \in \ell_0(\mathbb{N}); \tag{B.468}$$
$$m_f \in B_1(H) \text{ iff } f \in \ell^1(\mathbb{N}); \tag{B.469}$$
$$\|m_f\|_1 = \|f\|_1. \tag{B.470}$$

Here $\ell_0(\mathbb{N})$ consists of all $f : \mathbb{N} \to \mathbb{C}$ for which $\lim_{x \to \infty} f(x) = 0$. In particular, if $\dim(H) = \infty$ we have proper inclusions

$$B_1(H) \subset B_0(H) \subset B(H). \tag{B.471}$$


Proof. 1. For any $a \in B(H)$ we have $a \in B_0(H)$ iff $|a| \in B_0(H)$ by the polar decomposition (since $a = u|a|$ and $|a| = u^*a$ and $B_0(H)$ is a two-sided ideal in $B(H)$). In the present case, we have $|m_f| = \sqrt{m_f^* m_f} = \sqrt{m_{|f|^2}} = m_{|f|}$, whence $m_f \in B_0(H)$ iff $m_{|f|} \in B_0(H)$. Since $\sigma_p(m_{|f|}) = \{|f(x)|, x \in \mathbb{N}\}$, part 4 of Theorem B.136 applied to $a = m_{|f|}$ states that $f \in \ell_0(\mathbb{N})$.

2. This rapidly follows by computing $\mathrm{Tr}(|m_f|) = \mathrm{Tr}(m_{|f|})$ in the basis $\upsilon_x = \delta_x$, $x \in \mathbb{N}$, where $\delta_x(y) = \delta_{xy}$, as usual.
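
To see that the first inclusion is proper, one can compare two multiplication operators whose symbols both vanish at infinity. The sketch below evaluates partial sums of $\mathrm{Tr}(|m_f|) = \sum_x |f(x)|$ in the basis $(\delta_x)$; the two sample functions and the helper name `trace_norm_partial` are illustrative assumptions.

```python
import math

def trace_norm_partial(f, n_terms):
    """Partial sum of Tr|m_f| = sum_x |f(x)| over x = 1..n_terms."""
    return sum(abs(f(x)) for x in range(1, n_terms + 1))

def harmonic(x):
    """f(x) = 1/x: vanishes at infinity but is not summable."""
    return 1 / x

def inverse_square(x):
    """f(x) = 1/x^2: summable, with sum pi^2/6."""
    return 1 / x ** 2

for n in (10, 1000, 100000):
    print(n, trace_norm_partial(harmonic, n), trace_norm_partial(inverse_square, n))

print(math.pi ** 2 / 6)   # limiting value of the second column
```

The first column grows without bound (logarithmically), so $m_f$ with $f(x) = 1/x$ is compact but not trace-class, whereas the second column approaches $\pi^2/6$, which is then the trace-norm of the corresponding operator.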


Proposition B.144. The map

$$\mathrm{Tr} : B_1(H) \to \mathbb{C}; \tag{B.472}$$
$$a \mapsto \sum_i \langle\upsilon_i, a\upsilon_i\rangle, \tag{B.473}$$

where $(\upsilon_i)$ is some basis of $H$, is well defined, (obviously) linear, and independent of the choice of basis. Furthermore, (A.78), i.e., $\mathrm{Tr}(ab) = \mathrm{Tr}(ba)$, holds.


Proof. Taking $b = 1_H$ in (A.100), we have $|\mathrm{Tr}(a)| \leq \|a\|_1 < \infty$ for $a \in B_1(H)$. Independence of the choice of basis follows by first decomposing $a = a' + ia''$, with $a' = \tfrac{1}{2}(a + a^*)$ and $a'' = -\tfrac{1}{2}i(a - a^*)$ self-adjoint, as usual, and subsequently using Theorem B.132 to write $a' = a'_+ - a'_-$, with

$$a'_\pm = \pm\sum_{\lambda \in \sigma_p(a') \cap \mathbb{R}^\pm} \lambda e_\lambda, \tag{B.474}$$

and likewise for $a''$. This makes $a$ a linear combination of four positive operators, whence the claim follows from (B.458) and the obvious linearity of (B.473).

To establish (A.78), we first note that $\mathrm{Tr}(au) = \mathrm{Tr}(ua)$ for any unitary $u$; this is the same as (A.79), which has just been proved. The claim then follows from the following (generally useful) lemma.


Lemma B.145. Any a € B(H) is a linear combination of at most four unitaries. 


Proof. By the previous argument, we may assume that $a^* = a$, and for convenience we also assume that $\|a\| \leq 1$. In that case, $\|a\psi\| \leq \|\psi\|$ and hence $1 - a^2 \geq 0$, so that $\sqrt{1 - a^2}$ is defined, cf. Lemma B.141. Defining the two operators

$$u_\pm = a \pm i\sqrt{1 - a^2}, \tag{B.475}$$

we find $u_\pm^* u_\pm = u_\pm u_\pm^* = 1_H$, making each $u_\pm$ unitary, and $a = \tfrac{1}{2}(u_+ + u_-)$. If $a \neq a^*$, the number of terms at most doubles.


The deeper significance of the trace-class operators now emerges. 




Theorem B.146. For any Hilbert space $H$, we have dualities and double dualities

$$B_0(H)^* \cong B_1(H); \tag{B.476}$$
$$B_1(H)^* \cong B(H); \tag{B.477}$$
$$B_0(H)^{**} \cong B(H); \tag{B.478}$$
$$B_1(H)^{**} \cong B(H)^*, \tag{B.479}$$

where the symbol $\cong$ stands for isometric isomorphism. Explicitly:


• Any norm-continuous linear map $\omega : B_0(H) \to \mathbb{C}$ takes the form

$$\omega(b) = \mathrm{Tr}(ab), \tag{B.480}$$

for some $a \in B_1(H)$ uniquely determined by $\omega$, and vice versa, giving a bijective correspondence between $\omega \in B_0(H)^*$ and $a \in B_1(H)$ satisfying

$$\|\omega\| = \|a\|_1. \tag{B.481}$$

This equality remains valid if $\omega$ is regarded as an element of $B(H)^*$ via (B.479) and the isometric embedding $B_1(H) \hookrightarrow B_1(H)^{**}$ (cf. Proposition B.44).

• Any norm-continuous linear map $\chi : B_1(H) \to \mathbb{C}$ takes the form

$$\chi(a) = \mathrm{Tr}(ab), \tag{B.482}$$

for some $b \in B(H)$ uniquely determined by $\chi$, and vice versa, giving a bijective correspondence between $\chi \in B_1(H)^*$ and $b \in B(H)$ satisfying

$$\|\chi\| = \|b\|. \tag{B.483}$$


Proof. It is clear from (A.100) that $B_1(H) \subseteq B_0(H)^*$, with $\|\omega\| \leq \|a\|_1$. For the opposite direction, we return to the projections $e_n$ in the proof of part 2 of Lemma B.142. Taking the trace over the basis $(\upsilon_i)$, we have

$$\|a\|_1 = \mathrm{Tr}(|a|) = \lim_n \mathrm{Tr}(e_n|a|e_n) = \lim_n \mathrm{Tr}(e_n|a|) = \lim_n \mathrm{Tr}(e_n u^* a) = \lim_n \omega(e_n u^*); \tag{B.484}$$

since $\omega(e_n u^*) \geq 0$, we have $\omega(e_n u^*) \leq \|\omega\|\,\|e_n u^*\| \leq \|\omega\|$, whence $\|a\|_1 \leq \|\omega\|$ (note that the limiting procedure is necessary here, since $\omega(u^*)$ would not be defined because typically $u^*$ is not compact). This proves (B.481).

To prove (B.476), it remains to be shown that every $\omega \in B_0(H)^*$ can be represented as (B.480). Noting that $B_0(H)$ is the norm-closure of the linear span of all operators of the sort $b = |\psi\rangle\langle\varphi|$, where $\psi, \varphi \in H$ are unit vectors, the functional $\omega$ is determined by its values on those operators. Given $\omega$, we define $a$ by its matrix elements $\langle\varphi, a\psi\rangle = \omega(|\psi\rangle\langle\varphi|)$. Evaluating the trace on a basis containing $\varphi$ yields $\mathrm{Tr}(a|\psi\rangle\langle\varphi|) = \langle\varphi, a\psi\rangle$ and hence gives (B.480) on operators $b$ of the said form, upon which the general case follows by continuity.




We now prove (B.477). As in the previous case, the inclusion $B(H) \subseteq B_1(H)^*$ is clear from (A.100), as is the inequality $\|\chi\| \leq \|b\|$. This time, the proof of the opposite inequality uses $a = |\psi\rangle\langle\varphi|$, in which case one easily obtains

$$\||\psi\rangle\langle\varphi|\|_1 = \|\psi\|\,\|\varphi\|, \tag{B.485}$$

which in the case of unit vectors equals unity. Assuming (B.482), this gives

$$|\chi(|\psi\rangle\langle\varphi|)| = |\mathrm{Tr}(|\psi\rangle\langle\varphi|\,b)| = |\langle\varphi, b\psi\rangle| \leq \|\chi\|\,\||\psi\rangle\langle\varphi|\|_1 = \|\chi\|. \tag{B.486}$$

Combined with (B.228), this gives $\|b\| \leq \|\chi\|$, and hence (B.483).

Finally, as in the previous case, given $\chi$, we find $b$ through its matrix elements $\langle\varphi, b\psi\rangle = \chi(|\psi\rangle\langle\varphi|)$, which gives (B.482) on the special trace-class operators defined by $a = |\psi\rangle\langle\varphi|$. Noting that the linear span of such operators is dense (in the trace-norm) in $B_1(H)$, once again this gives the general case by continuity.


Corollary B.147. 1. The vector space $B_1(H)$ is complete in the norm (B.464).
2. $B_1(H)$ is a two-sided ideal in $B(H)$ ($a \in B(H), b \in B_1(H) \Rightarrow ab \in B_1(H) \ni ba$).


Proof. The first claim follows from (B.476) and the completeness of $B_0(H)^*$ (cf. Theorem B.33 and §B.9). The second follows from (A.100) and (A.78).


This actually reveals a subtlety in (B.471): as a normed space, $B_0(H)$ simply inherits the norm of $B(H)$, in which it is complete. Clearly, $B_1(H)$ also inherits the norm of $B(H)$, but that is the wrong one: firstly, $B_1(H)$ is not complete in the operator norm (indeed, its completion is $B_0(H)$), and secondly, the operator norm is the wrong one for the fundamental dualities stated in Theorem B.146.

The following trace-class operators occupy the center stage in quantum theory. 


Definition B.148. A density operator is a positive operator $\rho \in B_1(H)$ such that

$$\mathrm{Tr}(\rho) = 1. \tag{B.487}$$


Equivalently, $\rho$ is a density operator iff it has a norm-convergent expansion

$$\rho = \sum_{\lambda \in \sigma_p(\rho)} \lambda \cdot e_\lambda, \tag{B.488}$$

where $\sigma_p(\rho)$ is some countable subset of $\mathbb{R}^+$ with 0 as its only possible accumulation point, the multiplicity $m_\lambda = \dim(H_\lambda)$ of each eigenvalue $\lambda > 0$ is finite, and

$$\sum_{\lambda \in \sigma_p(\rho)} \lambda \cdot m_\lambda = 1. \tag{B.489}$$

Similarly, (2.6) holds just as in finite dimension, i.e., (B.488) is equivalent to

$$\rho = \sum_i p_i|\upsilon_i\rangle\langle\upsilon_i|, \tag{B.490}$$

where $(\upsilon_i)$ is a basis of $H$, and the coefficients $(p_i)$ satisfy $p_i \geq 0$ and $\sum_i p_i = 1$. Furthermore, the $p_i$ have 0 as their only possible accumulation point and are such that each $t > 0$ occurs in the set $\{p_i\}$ at most finitely many times. Like (B.488), also the equivalent expansion (B.490) is norm-convergent by Theorem B.136.


Definition B.149. Let $H$ be a separable Hilbert space. An operator $a \in B(H)$ is called a Hilbert-Schmidt operator if for some (and hence any) basis $(\upsilon_i)$ of $H$,

$$\sum_i \|a\upsilon_i\|^2 < \infty. \tag{B.491}$$

We write $B_2(H)$ for the set of all Hilbert-Schmidt operators on $H$.


The argument that the sum in (B.491) is independent of the basis is based on (B.215) and is analogous to the computation (B.458), this time even without the complication of the square root, for we simply have $\sum_i \|a\upsilon_i\|^2 = \sum_i \langle a\upsilon_i, a\upsilon_i\rangle$, etc. For $a \in B_2(H)$, with foresight we define the expression (where $(\upsilon_i)$ is any basis of $H$):

$$\|a\|_2 = \sqrt{\mathrm{Tr}(a^*a)} = \Big(\sum_i \|a\upsilon_i\|^2\Big)^{1/2}. \tag{B.492}$$


Theorem B.150. Let $H$ be a separable Hilbert space.
1. For any $a \in B(H)$ we have

$$\|a\| \leq \|a\|_2 \leq \|a\|_1. \tag{B.493}$$

2. Every Hilbert-Schmidt operator is compact, and refining (B.471) one has

$$B_1(H) \subset B_2(H) \subset B_0(H). \tag{B.494}$$

3. The Hilbert-Schmidt operators $B_2(H)$ form a Hilbert space with inner product

$$\langle a, b\rangle_2 = \mathrm{Tr}(a^*b), \tag{B.495}$$

and a Banach space in the ensuing norm (A.2), which equals (B.492). Clearly,

$$B_2(H)^* \cong B_2(H). \tag{B.496}$$

4. The Banach space $B_2(H)$ is a two-sided *-ideal in $B(H)$, and if $a \in B_2(H)$ and $b \in B(H)$ we have $\|ba\|_2 \leq \|b\|\,\|a\|_2$ and $\|ab\|_2 \leq \|b\|\,\|a\|_2$.


Proof. 1. Take any unit vector $\psi \in H$ and complete it to a basis of $H$. This gives
$\|a\psi\| \le \|a\|_2$. Taking the supremum over all such $\psi$ gives the first inequality.
The second one will be proved in the next item.

2. With $e_n$ from the proof of Lemma B.142.2, for $a \in B_2(H)$ we define $a_n = a e_n$,
and note that because $\sum_i \|a v_i\|^2$ converges, $\|a - a_n\|_2^2 = \sum_{i>n} \|a v_i\|^2 \to 0$.
By the previous item, $\|a_n - a\| \to 0$. Since $a_n \in B_0(H)$, by Proposition B.131
also $a$ is compact. For the second inequality in (B.493), Theorem B.136 yields:

\[ \|a\|_2^2 = \sum_i \mu_i \le \Big(\sum_i \sqrt{\mu_i}\Big)^2 = \|a\|_1^2, \tag{B.497} \]

where the $\mu_i \ge 0$ are the eigenvalues of the positive compact operator $a^*a$; the
eigenvalues of the compact operator $|a| = \sqrt{a^*a}$ are $(\sqrt{\mu_i})$.
3. We first show that $B_2(H)$ is a vector space. For any $a,b \in B(H)$ we have

\[ 2(a^*a + b^*b) = (a+b)^*(a+b) + (a-b)^*(a-b), \tag{B.498} \]

so that $(a+b)^*(a+b) \le 2(a^*a + b^*b)$ and hence $\|a+b\|_2^2 \le 2(\|a\|_2^2 + \|b\|_2^2)$.
Therefore, if $a,b \in B_2(H)$, then $a+b \in B_2(H)$. Since $\|\lambda a\|_2 = |\lambda|\,\|a\|_2$, it is
clear that if $a \in B_2(H)$, then $\lambda a \in B_2(H)$. Hence $B_2(H)$ is a vector space.
Furthermore, because of the identity

\[ a^*b = \tfrac{1}{4}\sum_{k=0}^{3} i^k (b + i^k a)^*(b + i^k a), \tag{B.499} \]

the inner product (B.495) may be rewritten as

\[ \langle a,b\rangle_2 = \sum_i \langle e_i, a^*b\, e_i\rangle = \tfrac{1}{4}\sum_{k=0}^{3} i^k \|b + i^k a\|_2^2, \tag{B.500} \]

which shows that if $a,b \in B_2(H)$, then $\langle a,b\rangle_2$ is finite. This reconfirms the fact that
the trace in (B.495) may be computed in any basis, since this is true for each term
on the right-hand side of (B.500). Sesquilinearity of (B.495) is straightforward.
To prove positive definiteness, we use part 1: if $\|a\|_2 = 0$, then $\|a\| = 0$ and hence
$a = 0$, since we already know that $\|\cdot\|$ is a norm.

Knowing that (B.495) is an inner product on $B_2(H)$, it immediately follows that
$\|\cdot\|_2$ is a norm on $B_2(H)$, since, as already noted, $\|a\|_2^2 = \langle a,a\rangle_2$.

Finally, to prove completeness, we pick a basis $(v_i)$ in $H$ and note that $B_2(H)$
is the closure of the linear span of all operators of the form $a = \sum_{i,j} c_{ij} |v_i\rangle\langle v_j|$.
This is because of the continuity of the inclusion $B_2(H) \subset B_0(H)$ (which is true
because of part 1 and the fact that $B_0(H)$ is itself the closure of this linear span).
An easy calculation then gives

\[ \|a\|_2^2 = \Big\|\sum_{i,j} c_{ij} |v_i\rangle\langle v_j|\Big\|_2^2 = \sum_{i,j} |c_{ij}|^2. \tag{B.501} \]

Hence $B_2(H)$ is isometrically isomorphic to the space of square-summable sequences
$(c_{ij})$ indexed by $\mathbb{N} \times \mathbb{N}$, which by Theorem B.9 is complete in the $\ell^2$-norm
$\|c\|_2^2 = \sum_{i,j} |c_{ij}|^2$. Hence $B_2(H)$ is complete, too.

4. From (A.78) (proved in Proposition B.144) we have $\mathrm{Tr}(a^*a) = \mathrm{Tr}(aa^*)$, so that
$a \in B_2(H)$ iff $a^* \in B_2(H)$. If $b \in B(H)$ and $a \in B_2(H)$, then $\|b a v_i\| \le \|b\|\,\|a v_i\|$
and hence $\|ba\|_2 \le \|b\|\,\|a\|_2$, so $ba \in B_2(H)$, and hence also $a^* \in B_2(H)$ and
$a^*b^* \in B_2(H)$. Similarly, $ab \in B_2(H)$, with $\|ab\|_2 \le \|b\|\,\|a\|_2$.
B.21 Spectral theory for unbounded self-adjoint operators


Although there is hardly any distinction between bounded and unbounded self- 
adjoint operators in so far as the definition and elementary properties of the spectrum 
are concerned (cf. Definitions B.80 and B.85, Theorem B.91, and Theorem B.93), 
extending the various versions of the spectral theorem to the unbounded case is a 
highly nontrivial matter. There are many ways of accomplishing this, among which 
our presentation has the virtues that firstly (in contrast to von Neumann’s original ap- 
proach based on the Cayley transform) we stay within the realm of self-adjointness, 
and secondly we preserve the C*-algebraic spirit of Theorem B.94. Thirdly, our 
treatment is sufficiently general to cover the two main applications in quantum me- 
chanics (viz. the Born rule and Stone’s Theorem). For those applications, setting up 
a functional calculus for bounded Borel functions suffices, but in order to state even 
the defining property $\mathrm{id}_{\sigma(a)} \mapsto a$ of the functional calculus also for unbounded $a$ (cf.
Theorem B.94), unbounded continuous functions will also have to be incorporated 
(but we refrain from a further generalization to unbounded Borel functions). 
Our approach starts from the observation that (with slight abuse of notation)

\[ y: \mathbb{R} \to (-1,1); \tag{B.502} \]
\[ y(x) = x(1+x^2)^{-1/2}; \tag{B.503} \]
\[ y^{-1}(x) = x(1-x^2)^{-1/2}, \tag{B.504} \]

provides a homeomorphism $\mathbb{R} \cong (-1,1)$. This has an operatorial counterpart

\[ a \mapsto a(1_H + a^2)^{-1/2} = b; \tag{B.505} \]
\[ b \mapsto b(1_H - b^2)^{-1/2} = a, \tag{B.506} \]

where the notation for the square roots should be carefully disambiguated as

\[ (1_H + a^2)^{-1/2} = ((1_H + a^2)^{-1})^{1/2}; \tag{B.507} \]
\[ (1_H - b^2)^{-1/2} = ((1_H - b^2)^{1/2})^{-1}. \tag{B.508} \]

As we shall see, the operator $(1_H + a^2)^{-1}$ is bounded (and so is $1_H - b^2$, of course),
so that square roots are only taken of bounded operators, in which case they are
defined by Lemma B.141. As in the numerical case (B.503), the correspondence $a \leftrightarrow b$
in (B.505) - (B.506) will turn out to be bijective, mapping the class of (possibly
unbounded) self-adjoint operators into the class of self-adjoint pure contractions:


Definition B.151. A pure contraction is a bounded operator $b: H \to H$ for which

\[ \|b\psi\| < \|\psi\| \quad (\psi \in H \setminus \{0\}). \tag{B.509} \]

If $b$ is in addition self-adjoint, this is equivalent to $\|b\| \le 1$ and $\ker(b \pm 1_H) = \{0\}$,
i.e., $\pm 1 \notin \sigma_p(b)$; the argument is similar to the proof of Lemma B.137.
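
The numerical maps (B.503) and (B.504) underlying the bounded transform are easy to experiment with. The sketch below is only an illustration: the function names `y` and `y_inv` mirror the notation of the text, while the sample points are arbitrary choices. It shows that large real numbers are squeezed towards the endpoints of $(-1,1)$ without ever reaching them, and that the two maps invert each other.

```python
import math

def y(x):
    """The map R -> (-1, 1) of (B.503)."""
    return x / math.sqrt(1.0 + x * x)

def y_inv(u):
    """The inverse map (-1, 1) -> R of (B.504)."""
    return u / math.sqrt(1.0 - u * u)

samples = [-1000.0, -3.0, -0.5, 0.0, 0.5, 3.0, 1000.0]    # arbitrary test points
images = [y(x) for x in samples]                           # all lie strictly inside (-1, 1)
round_trip = [y_inv(u) for u in images]                    # recovers the samples up to rounding
```

Since `1 - u*u` becomes tiny for large inputs, the round trip loses a few digits of accuracy near the endpoints; this is a floating-point effect and not a feature of the maps themselves.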
Eqs. (B.505) - (B.506) form a special case of a more general correspondence. 
Theorem B.152. The formal expressions

\[ b = a(1_H + a^*a)^{-1/2} \equiv a((1_H + a^*a)^{-1})^{1/2}; \tag{B.510} \]
\[ a = b(1_H - b^*b)^{-1/2} \equiv b((1_H - b^*b)^{1/2})^{-1}, \tag{B.511} \]

make rigorous sense and define a bijective correspondence between the class of
closed operators $a$ (with dense domain) and the class of (necessarily bounded)
pure contractions $b$. This correspondence preserves the adjoint, in that

\[ b^* = a^*(1_H + aa^*)^{-1/2}; \tag{B.512} \]
\[ a^* = b^*(1_H - bb^*)^{-1/2}, \tag{B.513} \]

and hence specializes to a bijective correspondence (B.505) - (B.506) between
self-adjoint operators $a$ and self-adjoint pure contractions $b$.
The (bounded) operator $b$ is called the bounded transform of $a$.


Proof. 1. From $b$ to $a$. If $b$ is a pure contraction, then $1_H - b^*b \ge 0$, since this means
\[ \langle\psi, b^*b\psi\rangle \le \langle\psi,\psi\rangle, \tag{B.514} \]
or $\|b\psi\|^2 \le \|\psi\|^2$. Furthermore, $1_H - b^*b$ is injective, since $(1_H - b^*b)\psi = 0$
implies $\|\psi\|^2 = \|b\psi\|^2$, contradicting (B.509) unless $\psi = 0$. This implies that $(1_H - b^*b)^{1/2}$
is injective, as $(1_H - b^*b)^{1/2}\psi = 0$ implies $(1_H - b^*b)\psi = 0$ and hence $\psi = 0$.
Thus the inverse (B.508) exists, with domain
\[ D((1_H - b^*b)^{-1/2}) = \mathrm{ran}((1_H - b^*b)^{1/2}). \tag{B.515} \]
This domain is dense in $H$, since for any $c \in B(H)$ (which in our case is $c = (1_H - b^*b)^{1/2}$)
we have $H = \ker(c) \oplus \ker(c)^\perp$; for $c^* = c$ we have $\ker(c) = \mathrm{ran}(c)^\perp$
and hence $\ker(c)^\perp = \mathrm{ran}(c)^-$, so that injectivity of $c$ yields $H = \mathrm{ran}(c)^-$. Hence
(B.511) is well defined on
\[ D(a) = \mathrm{ran}((1_H - b^*b)^{1/2}). \tag{B.516} \]
To prove that $a$ is closed, we write $a = bc^{-1}$, as above, and note that
\[ G(a) = \{(c\psi, b\psi),\ \psi \in H\} = \mathrm{ran}(v), \tag{B.517} \]
where $v: H \to H \oplus H$ is obviously defined by $v\psi = (c\psi, b\psi)$. Hence
\[ \|v\psi\|^2 = \|b\psi\|^2 + \|c\psi\|^2 = \|b\psi\|^2 + \|(1_H - b^*b)^{1/2}\psi\|^2 = \|\psi\|^2, \tag{B.518} \]

so that $v$ is an isometry. As such, $\mathrm{ran}(v) = G(a)$ is closed.
2. From $a$ to $b$. By definition, $D(1_H + a^*a) = D(a^*a)$, with

\[ D(a^*a) = \{\psi \in D(a) \mid a\psi \in D(a^*)\}. \tag{B.519} \]

We show that $1_H + a^*a: D(a^*a) \to H$ is bijective. First, (B.237) implies

\[ H \oplus H = G(a) \oplus G(a)^\perp = G(a) \oplus uG(a^*), \tag{B.520} \]

so for any $(\psi_1, \psi_2) \in H \oplus H$ there are unique $\varphi \in D(a)$ and $\chi \in D(a^*)$ such that

\[ \psi_1 = \varphi - a^*\chi; \tag{B.521} \]
\[ \psi_2 = a\varphi + \chi. \tag{B.522} \]

In particular, for $(\psi_1, \psi_2) = (\psi, 0)$ we obtain
\[ \psi = (1_H + a^*a)\varphi. \tag{B.523} \]

This shows both surjectivity and injectivity, since $\varphi$ is uniquely determined by
$\psi$. Consequently, the inverse

\[ (1_H + a^*a)^{-1}: H \to D(a^*a) \tag{B.524} \]
exists as a linear map, and since

\[ \|(1_H + a^*a)^{-1}\psi\| = \|\varphi\| \le \|(1_H + a^*a)\varphi\| = \|\psi\|, \tag{B.525} \]

we see that $(1_H + a^*a)^{-1}$ is bounded, with $\|(1_H + a^*a)^{-1}\| \le 1$. A similar argument
shows that $(1_H + a^*a)^{-1}$ is positive:

\[ \langle\psi, (1_H + a^*a)^{-1}\psi\rangle = \langle(1_H + a^*a)\varphi, \varphi\rangle = \|\varphi\|^2 + \|a\varphi\|^2 \ge 0, \tag{B.526} \]

so that the square root (B.507) exists. As before, injectivity of $(1_H + a^*a)^{-1}$
implies injectivity of its square root, whence $\mathrm{ran}((1_H + a^*a)^{-1/2})$ is dense in $H$.
Clearly, $(1_H + a^*a)^{-1/2}$ maps $\mathrm{ran}((1_H + a^*a)^{-1/2})$ to

\[ \mathrm{ran}((1_H + a^*a)^{-1}) = D(a^*a) \subset D(a), \tag{B.527} \]

so that the operator $b$ in (B.510) is defined on $\mathrm{ran}((1_H + a^*a)^{-1/2})$. We now show
that $b$ is bounded on the latter: for any $\psi \in H$ we have

\begin{align*}
\|b(1_H + a^*a)^{-1/2}\psi\|^2 &= \|a(1_H + a^*a)^{-1}\psi\|^2 \\
&= \langle(1_H + a^*a)^{-1}\psi, a^*a(1_H + a^*a)^{-1}\psi\rangle \\
&\le \langle(1_H + a^*a)^{-1}\psi, (1_H + a^*a)(1_H + a^*a)^{-1}\psi\rangle \\
&= \langle(1_H + a^*a)^{-1}\psi, \psi\rangle = \|(1_H + a^*a)^{-1/2}\psi\|^2, \tag{B.528}
\end{align*}

so that $b$ may be extended to all of $H$ by continuity, with $\|b\| \le 1$. Still denoting
this extension by $b$, we have

\[ b^*b = (1_H + a^*a)^{-1/2} a^*a (1_H + a^*a)^{-1/2} = 1_H - (1_H + a^*a)^{-1}, \tag{B.529} \]

from which it easily follows that $b$ is a pure contraction: for any $\psi \ne 0$, we have

\[ \|b\psi\|^2 = \langle\psi, b^*b\psi\rangle = \|\psi\|^2 - \|(1_H + a^*a)^{-1/2}\psi\|^2 < \|\psi\|^2, \tag{B.530} \]

since $\|(1_H + a^*a)^{-1/2}\psi\|^2 > 0$ by injectivity of $(1_H + a^*a)^{-1/2}$.

3. Bijectivity of the correspondence $a \leftrightarrow b$. If $a$ is determined by $b$ according to
(B.511), then

\[ 1_H + a^*a = (1_H - b^*b)^{-1}, \tag{B.531} \]

so that
\[ (1_H - b^*b)^{1/2} = (1_H + a^*a)^{-1/2}, \tag{B.532} \]

whence
\[ b = b(1_H - b^*b)^{-1/2}(1_H - b^*b)^{1/2} = a(1_H - b^*b)^{1/2} = a(1_H + a^*a)^{-1/2}. \tag{B.533} \]
Similarly, if $b$ is defined by $a$ according to (B.510), then (B.529), rewritten as
\[ 1_H - b^*b = (1_H + a^*a)^{-1}, \tag{B.534} \]
reproduces (B.511). To see that the domains match, in view of (B.516) we need

\[ D(a) = \mathrm{ran}((1_H + a^*a)^{-1/2}). \tag{B.535} \]

The inclusion $D(a) \supset \mathrm{ran}((1_H + a^*a)^{-1/2})$ already having been established in
step 2 above, we prove the opposite inclusion $\subset$. Indeed, for any $\psi \in D(a)$ we
have

\[ \psi = (1_H + a^*a)^{-1/2}\big(b^*a + (1_H + a^*a)^{-1/2}\big)\psi, \tag{B.536} \]

where $b$ is given by (B.510). This follows by taking inner products with $\varphi \in H$:

\begin{align*}
\langle\varphi, (1_H + a^*a)^{-1/2} b^*a\psi\rangle &+ \langle\varphi, (1_H + a^*a)^{-1}\psi\rangle \\
&= \langle a^*a(1_H + a^*a)^{-1}\varphi, \psi\rangle + \langle\varphi, (1_H + a^*a)^{-1}\psi\rangle = \langle\varphi, \psi\rangle. \tag{B.537}
\end{align*}


4. Self-adjointness. Since $a$ is closed we have $a^{**} = a$ (cf. Lemma B.74), so using
$a^*$ instead of $a$ in part 2 above, we have

\[ (1_H + aa^*)^{-1}: H \to D(aa^*) \subset D(a^*), \tag{B.538} \]
bijectively. If, in addition, $\psi \in D(a^*)$, we may compute
\[ a^*\psi = a^*(1_H + aa^*)(1_H + aa^*)^{-1}\psi = (1_H + a^*a)a^*(1_H + aa^*)^{-1}\psi, \tag{B.539} \]
from which it follows that
\[ a^*(1_H + aa^*)^{-1}\psi = (1_H + a^*a)^{-1}a^*\psi. \tag{B.540} \]

Similarly, for any polynomial $p$ in one real variable we have
\[ a^*p((1_H + aa^*)^{-1})\psi = p((1_H + a^*a)^{-1})a^*\psi. \tag{B.541} \]
By Weierstrass, we can find polynomials $p_n$ such that

\[ \lim_{n\to\infty} p_n((1+x)^{-1}) = (1+x)^{-1/2}, \tag{B.542} \]

for any $x \ge 0$, also cf. the proof of Lemma B.141. Hence by Theorem B.94 and
closedness of $a^*$ we obtain

\[ a^*(1_H + aa^*)^{-1/2}\psi = (1_H + a^*a)^{-1/2}a^*\psi = (a(1_H + a^*a)^{-1/2})^*\psi = b^*\psi, \tag{B.543} \]
for $\psi \in D(a^*)$. Since the latter is dense, we have (B.512). Bijectivity of the
correspondence $a \leftrightarrow b$ then also implies (B.513). In particular, $a^* = a$ iff $b^* = b$,
which implies the last claim of the theorem.


Though not needed in what follows, it would be a pity not to state:
Corollary B.153. If $a: D(a) \to H$ is closed (with $D(a)^- = H$), then:

1. $1_H + a^*a$ is self-adjoint on $D(a^*a)$;
2. $(1_H + a^*a)^{-1} = \pi_1 \circ e_{G(a)} \circ \iota$, where:

• $\iota: H \to H \oplus H$ is defined by $\iota\psi = (\psi, 0)$;
• $e_{G(a)}: H \oplus H \to H \oplus H$ is the projection onto the graph $G(a)$;
• $\pi_1: H \oplus H \to H$ is the projection $\pi_1(\psi_1, \psi_2) = \psi_1$ onto the first coordinate,

so that in total we duly have $\pi_1 \circ e_{G(a)} \circ \iota: H \to H$.
3. The closure of $a|_{D(a^*a)}$ is $a$ (in other words, $D(a^*a)$ is a core for $a$).


Proof. 1. Part 2 of the proof of Theorem B.152 yields positivity and hence self-adjointness
of $(1_H + a^*a)^{-1}$. The claim now follows from the (easily established)
fact that the inverse of an invertible self-adjoint operator is self-adjoint, too.

2. The reasoning following (B.521) - (B.522) yields $\pi_1 e_{G(a)} \iota(\psi) = \varphi$, where
$\varphi = (1_H + a^*a)^{-1}\psi$ by (B.523). Hence

\[ (1_H + a^*a)\,\pi_1 e_{G(a)} \iota = 1_H; \tag{B.544} \]
\[ \pi_1 e_{G(a)} \iota\,(1_H + a^*a) = 1_{D(a^*a)}. \tag{B.545} \]

3. This is a consequence of the fact that $\mathrm{ran}(1_H + a^*a) = H$, cf. part 2 of the above
proof, too. Indeed, we need to show that the graph of the restriction

\[ G(a|_{D(a^*a)}) = \{(\psi, a\psi),\ \psi \in D(a^*a)\} \tag{B.546} \]

is dense in the graph $G(a) = \{(\psi, a\psi),\ \psi \in D(a)\}$ within $H \oplus H$. In other words,
if $\Psi \in G(a)$ satisfies $\langle\Phi, \Psi\rangle_{H \oplus H} = 0$ for each $\Phi \in G(a|_{D(a^*a)})$, then $\Psi = 0$. With
$\Psi = (\psi, a\psi)$ and $\Phi = (\varphi, a\varphi)$, where $\psi \in D(a)$ and $\varphi \in D(a^*a)$, we obtain
$\langle\Phi, \Psi\rangle_{H \oplus H} = \langle(1_H + a^*a)\varphi, \psi\rangle_H$, which indeed vanishes for each $\varphi \in D(a^*a)$
iff $\psi = 0$.
To get a feeling for the constructions to follow, we first look at the bounded case.

Proposition B.154. If $a = a^*$ is bounded and $b$ is given by (B.505), then
\[ C^*(a) = C^*(b). \tag{B.547} \]

Furthermore, $\sigma(a) \subset \mathbb{R}$ and $\sigma(b) \subset (-1,1)$ (both included as compact subsets) are
homeomorphic via the maps (B.503) - (B.504), preserving eigenvalues, that is,

\[ \sigma(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma(b)\}; \tag{B.548} \]
\[ \sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\}; \tag{B.549} \]
\[ \sigma_p(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma_p(b)\}; \tag{B.550} \]
\[ \sigma_p(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma_p(a)\}. \tag{B.551} \]


Proof. By Theorem B.84 and Theorem B.93, $\sigma(a) \subset \mathbb{R}$ and $\sigma(b) \subset [-1,1]$ are
compact. We now show that in fact $\sigma(b) \subset (-1,1)$; in particular, $\pm 1 \notin \sigma(b)$. For if
$\pm 1 \in \sigma(b)$, then $b \mp 1_H$ is not invertible, so that, given that $\sqrt{1_H + a^2}$ is invertible,
by (B.505) the operator $\sqrt{1_H + a^2} \mp a$ is not invertible. But since each of the functions

\[ f_\pm(x) = \sqrt{1+x^2} \pm x \tag{B.552} \]

is strictly positive on any compact subset of $\mathbb{R}$, and

\[ \sqrt{1_H + a^2} \pm a = f_\pm(a), \tag{B.553} \]

the operator in question is invertible, with inverse $f_\pm(a)^{-1} = (1/f_\pm)(a)$. Contradiction.
Having thus localized $\sigma(b)$, it follows that $y^{-1}$ in (B.504) is continuous
on $\sigma(b)$, so that, with $a = y^{-1}(b)$, we have $a \in C^*(b)$ and hence $C^*(a) \subset C^*(b)$.
Similarly, $b = y(a)$ and hence $C^*(b) \subset C^*(a)$, whence (B.547).

Eqs. (B.550) - (B.551) follow from the explicit construction of the square
root in the proof of Lemma B.141: if $c\psi = \lambda\psi$, then $\sqrt{c}\,\psi = \sqrt{\lambda}\,\psi$. Likewise (more
trivially), if $c$ is invertible (whence $\lambda \ne 0$), then $c^{-1}\psi = \lambda^{-1}\psi$. The same result for
the full spectra follows either from the spectral mapping property (C.53), or from
the following direct argument. Given (B.547), Theorem B.94 yields an isomorphism
$C(\sigma(a)) \cong C(\sigma(b))$ of commutative C*-algebras, since we have

\[ C(\sigma(a)) \xrightarrow[\cong]{\ f \mapsto f(a)\ } C^*(a) = C^*(b) \xleftarrow[\cong]{\ g \mapsto g(b)\ } C(\sigma(b)). \tag{B.554} \]

Eqs. (B.548) - (B.549) then follow from the identities

\[ f(a) = (f \circ y^{-1})(b), \quad f \in C(\sigma(a)); \tag{B.555} \]
\[ g(b) = (g \circ y)(a), \quad g \in C(\sigma(b)), \tag{B.556} \]

which in turn follow from Theorem B.94.
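
The bounded case can be checked by hand in dimension two. The following sketch takes a real symmetric 2×2 matrix `a` (a hypothetical example), forms $b = y(a)$ through the spectral decomposition, and then verifies numerically that $1_H - b^2 = (1_H + a^2)^{-1}$ as in (B.534), that $y^{-1}(b)$ returns $a$, and that the eigenvalues are related as in (B.549). The helper names `spectral`, `calculus`, `matmul` and `inverse` are ours; the finite-dimensional functional calculus used here is merely a stand-in for Theorem B.94.

```python
import math

def spectral(a):
    """Eigenvalues and eigenvector angle of a real symmetric 2x2 matrix."""
    p, q, r = a[0][0], a[0][1], a[1][1]
    theta = 0.5 * math.atan2(2.0 * q, p - r)               # Jacobi rotation angle
    c, s = math.cos(theta), math.sin(theta)
    lam1 = p * c * c + 2.0 * q * c * s + r * s * s         # eigenvalue for v1 = (c, s)
    lam2 = p * s * s - 2.0 * q * c * s + r * c * c         # eigenvalue for v2 = (-s, c)
    return (lam1, lam2), (c, s)

def calculus(f, a):
    """f(a) = f(lam1) P1 + f(lam2) P2, with P1, P2 the spectral projections."""
    (l1, l2), (c, s) = spectral(a)
    f1, f2 = f(l1), f(l2)
    off = (f1 - f2) * c * s
    return [[f1 * c * c + f2 * s * s, off], [off, f1 * s * s + f2 * c * c]]

def matmul(x, z):
    return [[sum(x[i][k] * z[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def y(x):
    return x / math.sqrt(1.0 + x * x)                      # (B.503)

def y_inv(u):
    return u / math.sqrt(1.0 - u * u)                      # (B.504)

a = [[2.0, 1.0], [1.0, -1.0]]                              # hypothetical self-adjoint a
b = calculus(y, a)                                         # bounded transform, cf. (B.505)
a_back = calculus(y_inv, b)                                # inverse transform, cf. (B.506)

a2 = matmul(a, a)
one_plus_a2_inv = inverse([[1.0 + a2[0][0], a2[0][1]], [a2[1][0], 1.0 + a2[1][1]]])
b2 = matmul(b, b)
# residual of the identity 1 - b^2 = (1 + a^2)^{-1}
residual = [[(1.0 if i == j else 0.0) - b2[i][j] - one_plus_a2_inv[i][j]
             for j in range(2)] for i in range(2)]

eig_a = sorted(spectral(a)[0])
eig_b = sorted(spectral(b)[0])
```

The inverse of $1_H + a^2$ is computed directly from the matrix entries rather than through the spectral decomposition, so the vanishing of `residual` is an independent check.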
Now suppose $a$ is unbounded. In that case, its bounded transform $b$ remains
bounded, but its spectrum contains at least one of the points $\pm 1$. We abbreviate

\[ \tilde\sigma(b) = \sigma(b) \cap (-1,1). \tag{B.557} \]

Proposition B.155. If $a$ and $b$ are as in Theorem B.152, their spectra are related by

\[ \sigma(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \tilde\sigma(b)\}; \tag{B.558} \]
\[ \sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\}^-; \tag{B.559} \]
\[ \sigma_p(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma_p(b)\}; \tag{B.560} \]
\[ \sigma_p(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma_p(a)\}. \tag{B.561} \]

If $a$ is bounded this duly reduces to (and reproves) eqs. (B.548) - (B.551), since
$\sigma(b) \cap (-1,1) = \sigma(b)$, and the right-hand side of (B.559) is already closed in $\mathbb{R}$.


Lemma B.156. Let $a = a^* \in B(H)$. Then the spectrum $\sigma(a)$ according to Definition
B.80 coincides with the set $\sigma(a)$ in Definition B.81, where $A = C^*(a)$.

Proof. We must show that if $(a-\lambda)^{-1}$ exists in $B(H)$, then it exists in $C^*(a)$ (in
the double sense that $(a-\lambda)^{-1}$ lies in $C^*(a)$ and is the inverse of $(a-\lambda)$ in $C^*(a)$);
the converse is trivial. Using Theorem B.94 as well as the obvious invariance of the
spectrum (as in Definition B.81) under isomorphism, we might as well show that if
$(a-\lambda)^{-1}$ exists in $B(H)$, then the function $(\mathrm{id}_{\sigma(a)} - \lambda)^{-1}$ exists in $C(\sigma(a))$. This
is the case, since, by definition of $\sigma(a)$, the antecedent holds iff $\lambda \notin \sigma(a)$.

We apply this lemma with $a \rightsquigarrow b$ in order to prove Proposition B.155.


Proof. We know from (B.516) that $\sqrt{1_H - b^2}: H \to D(a)$ is a bijection. If $\lambda \in \rho(a)$,
then both maps in the following diagram are bijections:

\[ H \xrightarrow{\ \sqrt{1_H - b^2}\ } D(a) \xrightarrow{\ a - \lambda\ } H, \tag{B.562} \]

and this is the case iff $(a-\lambda) \circ \sqrt{1_H - b^2}$ is invertible, which, using (B.505), is
true iff $b - \lambda\sqrt{1_H - b^2}$ is invertible. Hence $\lambda \in \sigma(a)$ iff $b - \lambda\sqrt{1_H - b^2} \in C^*(b)$
is not invertible in $B(H)$, or, equivalently, in $C^*(b)$. Define $g_\lambda(y) = y - \lambda\sqrt{1 - y^2}$
in $C(\sigma(b))$, so that $g_\lambda(b) = b - \lambda\sqrt{1_H - b^2}$. Theorem B.94 (again with $a \rightsquigarrow b$)
then implies that $\lambda \in \sigma(a)$ iff $g_\lambda$ is not invertible in $C(\sigma(b))$, which according
to (B.253) (with $\lambda = 0$) is true iff $0 \in \mathrm{ran}(g_\lambda)$. Since $g_\lambda(\pm 1) = \pm 1 \ne 0$, even if
$\pm 1 \in \sigma(b)$, these values play no role, so that $0 \in \mathrm{ran}(g_\lambda)$ iff $\lambda = \mu(1-\mu^2)^{-1/2}$ for
some $\mu \in \sigma(b) \cap (-1,1)$. This yields (B.558) for $\sigma(a)$ and $\sigma(b)$.

The claimed refinement to the point spectrum follows as in the proof of Proposition
B.154. The same argument shows that any $\mu \in \sigma(b) \cap (-1,1)$ must come from
$\lambda \in \sigma(a)$, and since $\sigma(b)$ must be a closed subset of $[-1,1]$, this gives (B.559).

As an illustration, take $a$ to be the position operator on $H = L^2(\mathbb{R})$, so that $b = m_f$
with $f(x) = x/\sqrt{1+x^2}$. Eq. (B.276) then gives $\sigma(a) = \mathbb{R}$ and $\sigma(b) = [-1,1]$.
If $a$ is bounded, there are only two (commutative) C*-algebras to be concerned
with in a spectral theorem à la Theorem B.94, viz. $C(\sigma(a))$ and $C^*(a)$. In the unbounded
case, where $\sigma(a) \subset \mathbb{R}$ is no longer compact, already no fewer than four algebras
of continuous functions are associated with the spectrum, namely (cf. §B.3):

• the set $C_c(\sigma(a))$ of all continuous functions $f: \sigma(a) \to \mathbb{C}$ with compact support;
• the set $C_0(\sigma(a))$ of all continuous functions $f: \sigma(a) \to \mathbb{C}$ that vanish at infinity;
• the set $C_b(\sigma(a))$ of all bounded continuous functions $f: \sigma(a) \to \mathbb{C}$;
• the set $C(\sigma(a))$ of all continuous functions $f: \sigma(a) \to \mathbb{C}$.

Of these, the second and the third are commutative C*-algebras in the supremum-norm;
the first fails to be closed in this norm, whereas the last does not carry it (as
it would be infinite on any unbounded function). We have the obvious inclusions

\[ C_c(\sigma(a)) \subset C_0(\sigma(a)) \subset C_b(\sigma(a)) \subset C(\sigma(a)). \tag{B.563} \]


Each of these plays a role in spectral theory (as do measurable versions of them). On
the side of the bounded operator $b$, on top of $C(\sigma(b))$, we have analogous function
algebras, this time with inclusions

\[ C_c(\tilde\sigma(b)) \subset C_0(\tilde\sigma(b)) \subset C(\sigma(b)) \subset C_b(\tilde\sigma(b)) \subset C(\tilde\sigma(b)), \tag{B.564} \]

since $C(\sigma(b))$ consists of all functions $g$ in $C_b(\tilde\sigma(b))$ for which $\lim_{y \to \pm 1} g(y)$ exists,
which limit is equal to zero iff $g \in C_0(\tilde\sigma(b))$. Since $y^{-1}: (-1,1) \to \mathbb{R}$ in (B.504)
restricts to a homeomorphism $\tilde\sigma(b) \to \sigma(a)$ because of (B.558), the map

\[ C_\bullet(\sigma(a)) \to C_\bullet(\tilde\sigma(b)), \quad f \mapsto f \circ y^{-1}, \tag{B.565} \]

is an isomorphism for $\bullet = c, 0, b$, or blank (which is isometric for $0$ and $b$). If
$f \in C_0(\sigma(a))$, as in (B.555) (but no longer assuming $a$ to be bounded), we may define

\[ f(a) = (f \circ y^{-1})(b), \tag{B.566} \]

since $f \circ y^{-1} \in C_0(\tilde\sigma(b))$, and in view of (B.564), the right-hand side is defined by
the continuous functional calculus for $b$, i.e., $g \mapsto g(b)$, where $g \in C(\sigma(b))$; the
same is then true for $f \in C_c(\sigma(a))$. Let the (typically non-unital) *-algebras

\[ C_c^*(b) = \{g(b) \mid g \in C_c(\tilde\sigma(b))\}; \tag{B.567} \]
\[ C_0^*(b) = \{g(b) \mid g \in C_0(\tilde\sigma(b))\}, \tag{B.568} \]

be the pertinent images under this calculus. In view of (B.568), we then have

\[ C_c^*(b) \subset C_0^*(b) \subset C^*(b) \subset M(C_0^*(b)) \subset M(C_c^*(b)), \tag{B.569} \]

where $M(C_0^*(b))$ and $M(C_c^*(b))$ are the multiplier algebras of $C_0^*(b)$ and $C_c^*(b)$,
respectively, cf. §C.10. Note that $M(C_0^*(b))$ is a C*-algebra contained in $B(H)$,
whereas $M(C_c^*(b))$ consists (partly) of unbounded operators (see below).
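
To see the definition (B.566) at work in the simplest possible setting, one may model $a$ by a diagonal matrix, i.e., by multiplication with finitely many real numbers. The sketch below is such a toy model, not the construction of the text: the list `spectrum_a`, the test function `f` (a Gaussian, which vanishes at infinity) and all names are hypothetical choices. It computes $f(a)$ once directly and once via the bounded transform, as $(f \circ y^{-1})(b)$.

```python
import math

def y(x):
    return x / math.sqrt(1.0 + x * x)          # (B.503)

def y_inv(u):
    return u / math.sqrt(1.0 - u * u)          # (B.504)

def f(x):
    """A sample function vanishing at infinity (hypothetical choice)."""
    return math.exp(-x * x)

# Diagonal model: a multiplies the k-th coordinate by spectrum_a[k].
spectrum_a = [-50.0, -2.0, -0.1, 0.0, 0.3, 4.0, 75.0]
spectrum_b = [y(lam) for lam in spectrum_a]    # eigenvalues of the bounded transform b

f_of_a_direct = [f(lam) for lam in spectrum_a]
f_of_a_via_b = [f(y_inv(mu)) for mu in spectrum_b]   # (f o y^{-1})(b), cf. (B.566)
```

In this model the large eigenvalues of $a$ are mapped very close to $\pm 1$, which hints at why these endpoints enter $\sigma(b)$ once $a$ is genuinely unbounded.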
Lemma B.157. The (finite) linear span $C_c^*(b)H$ of all vectors of the form $g(b)\psi$,
where $g \in C_c(\tilde\sigma(b))$ and $\psi \in H$, is dense in $H$, i.e., $(C_c^*(b)H)^- = H$.

This would be trivial for $C^*(b)H$, since unlike $C_c^*(b)H$ it contains the unit $1_H$.

Proof. Approximate $1_{(-1,1)}$ pointwise by some monotone increasing bounded sequence
$(f_n)$ with compact support, cf. Lemma B.97; for example, define

\[ f_n: (-1,1) \to \mathbb{R}; \tag{B.570} \]
\[ f_n(x) = 0 \quad (x \in (-1, -1+1/n],\ x \in [1-1/n, 1)); \tag{B.571} \]
\[ f_n(x) = 1 \quad (x \in [-1+2/n, 1-2/n]), \tag{B.572} \]

and linear interpolation elsewhere. As in (B.317), we then have $f_n(b) \to 1_H$ strongly.
By definition of $C_c^*(b)$, this yields the claim.
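
The cutoff functions (B.570) - (B.572) admit a closed formula, which makes their monotonicity in $n$ transparent. The following sketch is one way of writing them; the name `f_n`, the sample points and the chosen values of $n$ are ours, and the formula $\min(1, \max(0, n(1-|x|) - 1))$ is simply the linear interpolation described in the proof.

```python
def f_n(n, x):
    """Piecewise-linear cutoff on (-1, 1): 0 for |x| >= 1 - 1/n, 1 for |x| <= 1 - 2/n."""
    return min(1.0, max(0.0, n * (1.0 - abs(x)) - 1.0))

points = [-0.999, -0.9, -0.5, 0.0, 0.25, 0.75, 0.99]      # arbitrary points in (-1, 1)
ns = (2, 5, 10, 100, 10000)
values = {n: [f_n(n, x) for x in points] for n in ns}     # rows increase with n, tending to 1
```

For each fixed $x \in (-1,1)$ the number $n(1-|x|) - 1$ increases with $n$ and eventually exceeds 1, so $f_n(x)$ increases to 1, as required.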


Theorem B.158. Let $a$ be a (possibly unbounded) self-adjoint operator on $H$.
1. For any $f \in C_b(\sigma(a))$, the operator $f(a)_0$, initially defined by linear extension of

\[ f(a)_0 h(a)\psi = (fh)(a)\psi \equiv ((fh) \circ y^{-1})(b)\psi, \tag{B.573} \]
where $h \in C_0(\sigma(a))$ and $\psi \in H$, i.e., defined on the domain $C_0^*(b)H$ (cf. (B.565) with $\bullet = 0$), is bounded, with
\[ \|f(a)_0\| \le \|f\|_\infty, \tag{B.574} \]
and hence extends from $C_0^*(b)H$ to all of $H$ by continuity; we write
\[ f(a) = f(a)_0^-. \tag{B.575} \]

2. The functional calculus $f \mapsto f(a)$ from $C_b(\sigma(a))$ to $B(H)$ thus established satisfies
the algebraic rules (B.289) - (B.291), and one has the reassuring cases

\[ 1_{\sigma(a)}(a) = 1_H; \tag{B.576} \]
\[ (\mathrm{id}_{\sigma(a)} - z)^{-1}(a) = (a - z)^{-1} \quad (z \in \rho(a)). \tag{B.577} \]

Conceptually, what is going on here is that the homomorphism

\[ C_0(\sigma(a)) \to B(H); \tag{B.578} \]
\[ f \mapsto f(a), \tag{B.579} \]

as defined in (B.566), is extended to the multiplier algebra
\[ M(C_0(\sigma(a))) = C_b(\sigma(a)). \tag{B.580} \]

Theorem C.77 then applies, since by Lemma B.157 the initial homomorphism is
nondegenerate, immediately yielding boundedness of $f(a)$. Below we will also give
an independent proof of (B.574).
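Proof. The operator $f(a)_0$ is densely defined by Lemma B.157 (which a fortiori
implies that $C_0^*(b)H$ is dense in $H$). To prove that $f(a)_0$ is bounded, take $\varepsilon > 0$
and hence find a compact subset $K \subset \mathbb{R}$ such that $|f(x)h(x)| < \varepsilon$ whenever $x \notin K$.
Writing $\tilde f = f \circ y^{-1}$ etc. (so that $\tilde h = h \circ y^{-1}$ and $\tilde K = y(K)$), using (B.322) with
$f \rightsquigarrow 1_{\tilde K^c}\tilde f\tilde h$ we obtain

\[ \|(1_{\tilde K^c}\tilde f\tilde h)(b)\psi\| \le \|(1_{\tilde K^c}\tilde f\tilde h)(b)\|\,\|\psi\| \le \|1_{\tilde K^c}\tilde f\tilde h\|_\infty\|\psi\| \le \varepsilon\|\psi\|. \tag{B.581} \]

From this, using also the homomorphism property in Theorem B.102, we then find

\begin{align*}
\|(fh)(a)\psi\| &= \|(\tilde f\tilde h)(b)\psi\| \\
&= \|(1_{\tilde K}\tilde f\tilde h)(b)\psi + (1_{\tilde K^c}\tilde f\tilde h)(b)\psi\| \\
&\le \|(1_{\tilde K}\tilde f\tilde h)(b)\psi\| + \|(1_{\tilde K^c}\tilde f\tilde h)(b)\psi\| \\
&= \|(1_{\tilde K}\tilde f)(b)\tilde h(b)\psi\| + \|(1_{\tilde K^c}\tilde f\tilde h)(b)\psi\| \\
&\le \|1_{\tilde K}\tilde f\|_\infty\|h(a)\psi\| + \varepsilon\|\psi\| \\
&\le \|f\|_\infty\|h(a)\psi\| + \varepsilon\|\psi\|, \tag{B.582}
\end{align*}
where the last step uses
\[ \|1_{\tilde K}\tilde f\|_\infty \le \|\tilde f\|_\infty = \|f\|_\infty. \tag{B.583} \]

Since the last expression in (B.582) is independent of $K$, we may let $\varepsilon \to 0$, obtaining
boundedness of $f(a)_0$ as well as (B.574).

The second claim should be obvious from (B.566) and Theorem B.94.

Eq. (B.576) is trivial. To prove (B.577), write $f(x) = (x-z)^{-1}$, where $z \in \rho(a)$
is fixed and $x \in \sigma(a)$. We have

\[ f(a)_0 h(a)\psi = (fh)(a)\psi = (a-z)^{-1}h(a)\psi, \tag{B.584} \]

and hence
\[ f(a)_0\varphi = (a-z)^{-1}\varphi, \tag{B.585} \]

for any $\varphi \in D(f(a)_0) = C_0^*(b)H$. So if $\varphi_n \to \varphi$ for $\varphi \in H$ and $\varphi_n \in D(f(a)_0)$,
boundedness and hence continuity of the operator $(a-z)^{-1}$ implies

\[ f(a)\varphi = \lim_n f(a)_0\varphi_n = \lim_n (a-z)^{-1}\varphi_n = (a-z)^{-1}\varphi. \]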

To construct a (typically unbounded) operator $f(a)$ for $f \in C(\sigma(a))$ in this fashion
(think of $a$ itself, corresponding to $f = \mathrm{id}_{\sigma(a)}$), we first define

\[ D(f(a)_0) = C_c^*(b)H = \mathrm{span}\{h(a)\psi \mid h \in C_c(\sigma(a)),\ \psi \in H\}, \tag{B.586} \]

and an operator $f(a)_0: D(f(a)_0) \to H$ may once again be defined by (B.573); once
again, the whole point is that although $f$ may well be unbounded, $h$ and hence $fh$
lie in $C_c(\sigma(a))$, so that $(fh)(a)$ is defined by (B.566), and hence eventually by the
continuous functional calculus for the bounded self-adjoint operator $b$.
As in the remark following Theorem B.158, from the point of view of multiplier
algebras, eq. (B.573) extends the (nondegenerate) homomorphism $C_c(\sigma(a)) \to B(H)$
to the algebra $C(\sigma(a))$ of unbounded multipliers on $C_c(\sigma(a))$.

This is not the end of the construction, since $f(a)_0$ is typically not closed on the
domain (B.586). However, it is a very near miss, since $f(a)_0$ is closable, cf. §B.13.
To prove that the operator $f(a)_0$ in (B.573) is closable, we use the second criterion in
Lemma B.74. For $g,h \in C_c(\sigma(a))$ and $\psi, \varphi \in H$ we may compute:

\[ \langle g(a)\varphi, f(a)_0 h(a)\psi\rangle = \langle\varphi, g(a)^* f(a)_0 h(a)\psi\rangle = \langle\varphi, (g^*fh)(a)\psi\rangle; \tag{B.587} \]
\[ \langle(f^*g)(a)\varphi, h(a)\psi\rangle = \langle\varphi, (f^*g)(a)^* h(a)\psi\rangle = \langle\varphi, (g^*fh)(a)\psi\rangle. \tag{B.588} \]

Hence $D(f(a)_0^*)$ must contain $D(f(a)_0)$, and on the latter we may put

\[ f(a)_0^*\, g(a)\varphi = (f^*g)(a)\varphi, \tag{B.589} \]

as in (B.573). In particular, $D(f(a)_0^*)$ is dense in $H$, so that $f(a)_0$ is closable. Furthermore,
if $f^* = f$, then $f(a)_0$ is symmetric, i.e., $f(a)_0 \subset f(a)_0^*$. Hence the closure

\[ f(a) = f(a)_0^-: D(f(a)) \to H, \tag{B.590} \]

is the operator we are looking for, where $D(f(a))$ consists of all $\psi \in H$ for which
there exists a sequence $(\psi_n)$ in $D(f(a)_0)$ such that $\psi_n \to \psi$ and $f(a)_0\psi_n$ converges,
upon which Lemma B.74 gives

\[ f(a)\psi = \lim_n f(a)_0\psi_n. \tag{B.591} \]
What's more, if $f^* = f$, then $f(a)_0$ is essentially self-adjoint, i.e.,

\[ f(a)_0^{**} = f(a)_0^*, \tag{B.592} \]

which (by taking the adjoint) is equivalent to the property we will actually prove:

\[ f(a)^* = f(a). \tag{B.593} \]
Theorem B.159. For real-valued $f \in C(\sigma(a))$, the operator $f(a)$ is self-adjoint.
The proof of self-adjointness relies on Nelson's Lemma:

Lemma B.160. Let $c \subset c^*$ be densely defined and symmetric. Then $c$ is essentially
self-adjoint if there exists a continuous unitary representation $t \mapsto u_t$ of $\mathbb{R}$ on $H$ such
that $u_t: D(c) \to D(c)$ for each $t \in \mathbb{R}$, and

\[ \frac{du_t}{dt}\psi \equiv \lim_{s \to 0} \frac{u_{t+s}\psi - u_t\psi}{s} = i c u_t\psi, \quad \psi \in D(c). \tag{B.594} \]

This lemma is closely related to Stone's Theorem; see Theorem 5.73 in §5.12.

Proof. The proof of Nelson's lemma relies on the following variation of Lemma
5.74 in §5.12, proved by applying the latter (or rather its proof) to the closure of $a$:
Lemma B.161. Let $a$ be symmetric. Then $a$ is essentially self-adjoint ($a^{**} = a^*$) iff
\[ \mathrm{ran}(a+i)^- = \mathrm{ran}(a-i)^- = H. \tag{B.595} \]

Applying Lemma B.161 in the same way as Lemma 5.74 is used in the proof of
self-adjointness of the generator $a$ in Theorem 5.73 yields Lemma B.160.
For Theorem B.159, with $c = f(a)_0$ for some $f \in C(\sigma(a), \mathbb{R})$, informally define

\[ u_t = \exp(itf(a)), \tag{B.596} \]
and formally define $u_t$ as the closure of the bounded operator
\[ (u_t)_0 = e_t(a)_0 \tag{B.597} \]

defined by the bounded function $e_t(x) = \exp(itf(x))$ on $\sigma(a)$, cf. (B.573). The verification
that $t \mapsto u_t$ defines a continuous one-parameter group of unitary operators
on $H$ is practically the same as in our proof of part 1 of Stone's Theorem, and the
proof of (B.594) is almost the same as a similar step in the proof of part 3 of that theorem,
so we will not repeat these here. Therefore, Lemma B.160 applies, showing
that $f(a)_0$ is essentially self-adjoint.


As an important special case of our continuous functional calculus, we have
\[ \mathrm{id}_{\sigma(a)}(a) = a, \tag{B.598} \]

just as in the bounded case. Writing $a_0$ for the operator $(\mathrm{id}_{\sigma(a)})(a)_0$, eq. (B.573)
gives $a_0\varphi = a\varphi$ for $\varphi \in D(a_0)$, cf. (B.586). Let $\psi \in D(a_0^-)$, so that there is a sequence
$(\psi_n)$ in $D(a_0)$ such that $\psi_n \to \psi$ and $(a_0\psi_n)$ converges. Since $a$ is closed,
it follows that $a_0\psi_n = a\psi_n \to a\psi$, so that $\psi \in D(a)$. Hence $a_0^- \subset a$. Since both
operators are self-adjoint, this implies $a_0^- = a$, which proves (B.598). The proof of
(B.577) is similar but easier, since $(a-z)^{-1}$ is bounded.

In similar vein, we may set up a functional calculus for bounded Borel functions
of $a$. If $f \in \mathcal{B}(\sigma(a))$, then $f \circ y^{-1} \in \mathcal{B}(\tilde\sigma(b))$, so that $(f \circ y^{-1})(b)$ is defined, cf.
Theorem B.102, and we may define $f(a)$ by (B.566). As in the continuous case, this
map $f \mapsto f(a)$ yields a homomorphism $\mathcal{B}(\sigma(a)) \to B(H)$, satisfying (B.322).

What is still missing, however, is the von Neumann algebra $W^*(a)$ in which this
homomorphism takes values. To close this section, we solve this issue.

If $c \in B(H)$ and $a$ is possibly unbounded, we say that (by convention):

\[ [a,c] = 0 \ \text{ iff } \ ca \subset ac, \tag{B.599} \]

that is, if $c\,D(a) \subset D(a)$ and $ca\psi = ac\psi$ for each $\psi \in D(a)$. We write $\{a\}'$ for
the set of all $c \in B(H)$ that commute with $a$. If $a^* = a$, looking at the graph of $a$
(and using the fact that $a$ is closed), it is easy to see that $\{a\}'$ is a strongly closed
unital *-subalgebra in $B(H)$. Therefore, by the bicommutant theorem, $\{a\}'$ is a von
Neumann algebra. Its commutant, defined in the usual sense (B.318), is denoted by $W^*(a)$, i.e.,
\[ W^*(a) = \{a\}''. \tag{B.600} \]
Theorem B.162. Let $a$ be a (possibly unbounded) self-adjoint operator on $H$. Then
\[ W^*(a) = W^*(b), \tag{B.601} \]

where $b$ is the bounded transform (B.510) of $a$. Consequently, if $f \in \mathcal{B}(\sigma(a))$ and
the operator $f(a)$ is defined by (B.566) and Theorem B.102, then $f(a) \in W^*(b)$.


Proof. We will prove a more general result of independent interest.

Definition B.163. A closed unbounded operator $a: D(a) \to H$ is affiliated to a von
Neumann algebra $A \subset B(H)$, written $a\,\eta\,A$, iff $[a,c] = 0$ for each $c \in A'$.

For example, if $a^* = a$, then $a\,\eta\,W^*(a)$, and if $a\,\eta\,B$ for some $B = B''$, then $W^*(a) \subset B$.

Proposition B.164. Let $A \subset B(H)$ be a von Neumann algebra and assume $a$ is a
self-adjoint operator on $H$ with bounded transform $b$. Then $a\,\eta\,A$ iff $b \in A$.


Proof. The first step consists in the observation that $a\,\eta\,A$ iff $[a,u] = 0$ (or, equivalently,
$uau^* = a$) merely for each unitary $u \in A'$. To see this, we strengthen Lemma
B.145 (in which we replace $a$ by $c$): if $c \in A'$, then $c$ is a linear combination of at
most four unitaries in $A'$. Indeed, the unitaries $u$ in the proof are constructed via the
continuous functional calculus of Theorem B.94, and hence they lie in $C^*(c) \subset A'$.

The second step is to show that $[a,u] = 0$ iff $[b,u] = 0$ for any unitary $u$. This is a
simple computation: if $uau^* = a$, then, looking at the domains in question,

\[ u(1_H + a^2)^{-1}u^* = (1_H + a^2)^{-1}; \tag{B.602} \]
\[ u((1_H + a^2)^{-1})^{1/2}u^* = ((1_H + a^2)^{-1})^{1/2}, \tag{B.603} \]

from which $ubu^* = b$ with $b$ defined by (B.510). Similarly, if $bu = ub$, then $uau^* = a$,
where $a$ is defined by (B.511). Theorem B.152 therefore yields the claim.


Theorem B.162 now follows: taking $A = W^*(a)$, so that $a\,\eta\,A$, yields $b \in W^*(a)$, and
hence $W^*(b) \subset W^*(a)$. On the other hand, taking $A = W^*(b)$, in which case $b \in A$,
gives $a\,\eta\,W^*(b)$, and hence $W^*(a) \subset W^*(b)$. This yields (B.601), from which the
final claim follows by our definition (B.566) and Theorem B.102.

Using this language, it can be shown that for possibly unbounded Borel functions $f$
on $\sigma(a)$, the possibly unbounded operator $f(a)$ is affiliated to $W^*(a)$. Furthermore,
there exists a Borel measure $\mu$ on $\sigma(a)$ such that the map $f \mapsto f(a)$ may also be
seen as a so-called essential homomorphism from $\mathcal{B}(\sigma(a))/\mathcal{N}(\sigma(a))$ into the
*-algebra of normal operators affiliated with $W^*(a)$, where $\mathcal{N}(\sigma(a))$ is the set of
$\mu$-null functions on $\sigma(a)$; this means that the algebraic properties hold after closure.
Notes


The history of functional analysis is described from various points of view by 
Bernkopf (1966, 1967), Birkhoff & Kreyszig (1984), Brezis & Browder (1998),
Dieudonné (1981), Monna (1973), Pier (2001), Pietsch (2007), Siegmund-Schultze 
(2003), and Steen (1973). Apart from von Neumann (1932), the other founding 
books of functional analysis—coincidentally from the same year, which closed the 
foundational era that began around 1900—were Banach (1932) and Stone (1932). 

The concept of a Hilbert space eventually emerged from Hilbert's work on
quadratic forms in infinitely many variables (see especially his fourth paper on the
subject, Hilbert, 1906), which in turn was inspired by his analysis of integral equations
(Hilbert, 1912). From a modern point of view, Hilbert's space was the unit ball
in $\ell^2(\mathbb{N})$; he did not adopt the perspective of linear spaces and operators.

An important step towards this perspective was what is now called the Riesz–Fischer
Theorem from 1907; Riesz (1907a) proved the isomorphism

\[ L^2([a,b]) \cong \ell^2(\mathbb{N}), \tag{B.604} \]

whereas Fischer (1907) proved the completeness of $L^2([a,b])$ and obtained Riesz's
isomorphism as a corollary. Riesz (1907b) also obtained the Riesz–Fréchet Theorem
for the special case $L^2([a,b])$, independently found also by Fréchet (1907).
In fact, Hilbert (1906) had already shown this (mutatis mutandis) for what we now
call $\ell^2(\mathbb{N})$; the general case had to wait for Riesz (1934) and Löwig (1934). The
latter was the first to study non-separable Hilbert spaces, including Corollary B.64.
Both Riesz and Fréchet in addition played major roles in establishing another famous
duality theorem, namely the one on the representation of linear functionals on
continuous functions by measures (cf. Theorem B.15 etc.); see Gray (1984).

Subsequently, Schmidt (1908) developed the linear and geometric structure of
$\ell^2(\mathbb{N})$, arguably the first Hilbert space studied as such, and Riesz (1913) explicitly
studied linear operators on this space. Finally, it was von Neumann (1927ab, 1932)
who first introduced Hilbert space and operator theory from an abstract point of
view, i.e., axiomatically. For a historical analysis of this step, which was triggered
by the attempts of von Neumann (originally jointly with Hilbert and Nordheim) to
provide a mathematical foundation for quantum mechanics, see Rédei (2005) and
Duncan & Janssen (2013); also cf. Corry (2004) on the role of Hilbert himself.

Functional analysis textbooks perused by the author include Conway (2007),
Dudley (1989), Kadison & Ringrose (1983), Maurin (1972), Reed & Simon (1972),
Rudin (1973), Schmüdgen (2012), and Weidmann (2000). A good place to start for
contemporary beginners is Rynne & Youngson (2008), followed by the more advanced
text by MacCluer (2009), which also introduces C*-algebras. A natural next
step would then be Pedersen (1989), and on to operator algebras!

Since most of the material in this appendix is standard except for the last three 
sections, it seems pointless to give detailed notes and attributions (so that several
sections even lack notes), except for a few comments on unusual cases, and some
supplementary material which would have distracted too much from the main text.

§B.2. $\ell^p$ spaces
Hölder’s Inequality (which incorporates the claim $fg \in \ell^1$) should be clear for
$p = 1$ or $p = \infty$. For $1 < p < \infty$, we use the fact that for any $s,t \in [0,\infty)$, one has

\[ s^{1/p}\, t^{1/q} \leq \frac{s}{p} + \frac{t}{q}. \tag{B.605} \]

Using (B.605) with $s = (|f(x)|/\|f\|_p)^p$ and $t = (|g(x)|/\|g\|_q)^q$ and summing over $x$

gives (B.15). To derive Minkowski’s Inequality for $1 < p < \infty$ (the cases $p = 1$ and
$p = \infty$ are obvious), define

\[ h(x) = |f(x) + g(x)|^{p-1}. \tag{B.606} \]

Arguing as in part 1 above, if $f \in \ell^p$ and $g \in \ell^p$, then $f + g \in \ell^p$ and hence $h \in \ell^q$,
since $h(x)^q = |h(x)|^q = |f(x) + g(x)|^p$. Now compute

\begin{align*}
\|f+g\|_p^p &= \sum_x |f(x)+g(x)|^p = \sum_x h(x)\,|f(x)+g(x)| \\
&\leq \sum_x |h(x)f(x)| + \sum_x |h(x)g(x)| = \|hf\|_1 + \|hg\|_1 \\
&\leq \|h\|_q \bigl(\|f\|_p + \|g\|_p\bigr) = \|f+g\|_p^{p/q} \bigl(\|f\|_p + \|g\|_p\bigr), \tag{B.607}
\end{align*}

where in the last inequality we have used (B.15). This immediately gives (B.14).
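
For the record, the final step may be spelled out as follows. This is only a sketch, assuming $1/p + 1/q = 1$ and $\|f+g\|_p \neq 0$ (otherwise Minkowski’s Inequality holds trivially), so that one may divide by $\|f+g\|_p^{p/q}$:

```latex
\begin{align*}
\|f+g\|_p^{p} \leq \|f+g\|_p^{p/q}\,\bigl(\|f\|_p + \|g\|_p\bigr)
\;&\Longrightarrow\;
\|f+g\|_p^{\,p - p/q} \leq \|f\|_p + \|g\|_p , \\
p - \frac{p}{q} = p\Bigl(1 - \frac{1}{q}\Bigr) &= p \cdot \frac{1}{p} = 1 .
\end{align*}
```

Hence the exponent on the left-hand side equals 1, which is the claimed inequality.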


§B.4. Basic measure theory
Standard textbooks on measure theory include Bogachev (2006), Dudley (1989),
Malliavin (1995), Rudin (1986), etc.


§B.5. Measure theory on locally compact Hausdorff spaces 

Urysohn’s Lemma states that if $X$ is a locally compact Hausdorff space and
$K \subset U \subset X$ with $K$ compact and $U$ open, then there is a function $g \in C_c(U)$ such that
$0 \leq g(x) \leq 1$ for each $x \in X$ and $g(x) = 1$ for $x \in K$. Similarly, since a locally
compact Hausdorff space is completely regular, for each closed set $F \subset X$ and point
$x \notin F$ there is a continuous function $f$ such that $f(x) \neq 0$ and $f|_F = 0$.

An example of a space that is locally compact Hausdorff but not $\sigma$-compact,
given by Rudin (1986), is $X = \mathbb{R}^2$ with topology given by the strange metric
$d((x,y),(x',y')) = 1 + |y - y'|$ if $x \neq x'$ and $d((x,y),(x,y')) = |y - y'|$.

For a (tedious) direct proof of Theorem B.19, see Rudin (1986), Thm. 2.14.
Alternatively, Theorem B.19 may be derived from Choquet theory, as mentioned in the
main text, or from the Daniell–Stone construction of measures from positive
functionals in a more general setting, see e.g. Bogachev, 2007, §7.8 or Dudley, 1989,
§4.5. For a proof of Theorem B.22 see Malliavin (1995), Thm. 5.3.8.

The theory of finitely additive measures is exhaustively discussed in Rao & Rao 
(1983); for a summary see Luxemburg (1991). The notion of a semiring of subsets 
of X goes back to von Neumann (1950). See also Loya (2008), including a detailed 
proof that Step(X, @) is a (commutative) algebra. 


§B.6. $L^p$ spaces
A nice result “taming” $L^p(X,\Sigma,\mu)$ is Lusin’s Theorem, assuming $\mu$ is regular:

Theorem B.165. Let $1 \leq p < \infty$. If the support of $f \in L^p(X)$ has finite measure, then
for any $\varepsilon > 0$ there exists $g \in C_c(X)$ such that $\mu(\{x \in X \mid f(x) \neq g(x)\}) < \varepsilon$.


§B.7. Morphisms and isomorphisms of Banach spaces 

The Baire Category Theorem states that a complete metric space cannot be a 
countable union of nowhere dense sets (where a set in a topological space is called 
nowhere dense if its closure has empty interior, i.e., does not contain a non-empty 
open set). In other words, if $(M,d)$ is complete and $M = \bigcup_n M_n$ with each $M_n$ closed,
then there is at least one $n \in \mathbb{N}$ for which $M_n$ contains an open ball.


§B.9. Duality 
The idea of writing (B.136) as $\lim_U f$ has the following origin.


1. Let $f : X \to K$ be any function between any pair of sets, and let $F$ be a filter on
$X$. Then $f_*F$, which consists of all $B \subset K$ for which $f^{-1}(B) \in F$, is a filter on
$K$, called the push-forward of $F$ by $f$. Moreover, if $U$ is an ultrafilter on $X$, then
$f_*U$ is an ultrafilter on $K$. This gives a map

\[ f_* : \mathrm{Ultra}(X) \to \mathrm{Ultra}(K). \tag{B.608} \]

If we equip $\mathrm{Ultra}(X)$ with the topology generated by all sets of the form

\[ U_A = \{U \in \beta X \mid A \in U\}, \tag{B.609} \]

where $A \subset X$, as in the main text, and likewise $\mathrm{Ultra}(K)$, then $f_*$ is continuous. If
$X$ is discrete, then $\mathrm{Ultra}(X) = \beta X$, but not otherwise.

2. We say that some filter $F$ on a topological space $X$ converges to $x \in X$ if $N_x \subseteq F$,
where $N_x$ is the neighbourhood filter of $x$, consisting of all neighbourhoods of $x$.
This is denoted by $\lim F = x$.

3. Combining points 1 and 2, if $\lim f_*F = z$, i.e., if $N_z \subseteq f_*F$, we write

\[ \lim_F f = z \quad (\in K). \tag{B.610} \]


4. As for sequences, it can be shown that filters on Hausdorff spaces have at most 
one limit, and that ultrafilters on compact spaces have at least one limit. Conse- 
quently, ultrafilters on compact Hausdorff spaces K have exactly one limit, i.e., 
converge to a unique point. This gives a continuous map 


\[ \lim : \mathrm{Ultra}(K) \to K. \tag{B.611} \]


5. It follows that if $X$ is any set (seen as a discrete topological space), $K$ is a compact
Hausdorff space, $f : X \to K$ is some function, and $U$ is an ultrafilter on $X$, then
$f_*U$ has a unique limit $z \in K$, written $\lim_U f = z$ or $\lim f_*U = z$, or $\beta f(U) = z$,
since the latter notation gives the extension $\beta f$ in the diagram (B.135). Thus
$\beta f = \lim \circ f_*$, as in the diagram that combines (B.608) and (B.611), viz.

\[ \beta X = \mathrm{Ultra}(X) \xrightarrow{\;f_*\;} \mathrm{Ultra}(K) \xrightarrow{\;\lim\;} K. \tag{B.612} \]

§B.11. Choquet’s Theorem

Our proof of Choquet’s Theorem was adapted from Simon (2011) and Ebbesen
(2012). For an extensive treatment of the surrounding Choquet Theory see e.g.
Alfsen (1970), Bratteli & Robinson (1987), or Phelps (2001). For the Schläfli
classification see Coxeter (1948).


§B.12. A précis of infinite-dimensional Hilbert space
To prove separability of $H = L^2(\mathbb{R}^d)$, note that a dense subset is given by the set
of all functions of the form $1_{B_n^d}\, p$, where $n \in \mathbb{N}$, $B_r^d = \{x \in \mathbb{R}^d \mid \|x\| \leq r\}$ is the $d$-ball
of radius $r$, and $p$ is some polynomial on $\mathbb{R}^d$ with rational coefficients. Alternatively,
take the complex rational linear span of all functions of the form $1_A$, where $A \subset \mathbb{R}^d$
is a rectangle with rational coefficients (proving density in either case requires some
measure theory). The latter construction has the advantage over the former that it
can be generalized to Hilbert spaces $H = L^2(X)$ for which the underlying measure
space $(X,\Sigma,\mu)$ satisfies the condition that the space of sets $A \in \Sigma$ with $\mu(A) < \infty$
is separable in the metric $d(E,F) = \mu(E \,\Delta\, F)$, where $E \,\Delta\, F = (E \cap F^c) \cup (E^c \cap F)$ is
the symmetric difference. Indeed, $L^2(X)$ is separable iff this condition is satisfied.

This class includes the important case where the underlying topological space
$X$ is Polish (i.e., homeomorphic to a complete separable metric space), $\Sigma$ consists
of the associated Borel sets, and $\mu$ is a $\sigma$-finite regular measure. If, furthermore, $\mu$
is finite, then Lemma B.121 (in its original form for Polish spaces) applies. As in
the proof of Theorem B.118, this induces Hilbert space isomorphisms like (in the
second case) $L^2(X) \cong L^2(0,1)$, which do not require a choice of basis. See Royden
(1988), Thm. 15.5.16 and Prop. 15.5.12, and Halmos (1974), p. 177.


§B.14. Basic spectral theory 

Our terminology “continuous spectrum” $\sigma_c(a)$ for the complement of the point
spectrum $\sigma_p(a)$ is not standard; many authors reserve the former term for the
complement of $\sigma_p(a)$ as well as the so-called residual spectrum $\sigma_r(a)$, which is defined
as the set of those $\lambda \in \sigma(a)$ for which $\lambda \notin \sigma_p(a)$ and $\mathrm{ran}(a - \lambda)^- \neq H$. However,
for self-adjoint operators $a$ (which is all we need in this book, and in quantum
mechanics), it follows from e.g. Theorem B.93 that $\sigma_r(a) = \emptyset$, so that at least for $a^* = a$
“our” continuous spectrum $\sigma_c(a)$ matches the usual terminology.

The proof of (B.258) in any Banach algebra $A$ with unit $1_A$ is as follows. We first
show that the sum is a Cauchy sequence. Indeed, for $n > m$ one has

\[ \Bigl\| \sum_{k=0}^n a^k - \sum_{k=0}^m a^k \Bigr\| = \Bigl\| \sum_{k=m+1}^n a^k \Bigr\| \leq \sum_{k=m+1}^n \|a^k\| \leq \sum_{k=m+1}^n \|a\|^k. \tag{B.613} \]

For $n,m \to \infty$ this converges to 0 by the theory of the geometric series. Since $A$ is
complete, the Cauchy sequence $\sum_{k=0}^n a^k$ converges for $n \to \infty$. Now compute

\[ \sum_{k=0}^n a^k (1_A - a) = \sum_{k=0}^n (a^k - a^{k+1}) = 1_A - a^{n+1}. \tag{B.614} \]

Hence

\[ \Bigl\| 1_A - \sum_{k=0}^n a^k (1_A - a) \Bigr\| = \|a^{n+1}\| \leq \|a\|^{n+1}, \tag{B.615} \]

which converges to zero when $n \to \infty$, as $\|a\| < 1$ by assumption. Thus

\[ \lim_{n\to\infty} \sum_{k=0}^n a^k (1_A - a) = 1_A. \tag{B.616} \]

By a similar argument,

\[ \lim_{n\to\infty} (1_A - a) \sum_{k=0}^n a^k = 1_A, \tag{B.617} \]

so that, by continuity of multiplication in a Banach algebra, one finally has

\[ \sum_{k=0}^\infty a^k = (1_A - a)^{-1}. \tag{B.618} \]
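
As a simple sanity check of this argument, one may take the purely illustrative case $A = \mathbb{C}$ with $a = 1/2$ (an assumption made here for concreteness, not an example from the main text), in which the statement reduces to the ordinary geometric series and the estimate (B.615) holds with equality:

```latex
\begin{align*}
\sum_{k=0}^{n} \left(\tfrac12\right)^k &= 2 - \left(\tfrac12\right)^{n}
\;\xrightarrow{\,n \to \infty\,}\; 2 = \left(1 - \tfrac12\right)^{-1}, \\
\Bigl| 1 - \sum_{k=0}^{n} \left(\tfrac12\right)^k \bigl(1 - \tfrac12\bigr) \Bigr|
&= \left(\tfrac12\right)^{n+1} .
\end{align*}
```
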


To see that the closure $a^-$ of a closable operator $a$ is indeed closed (!), suppose
$f_n \to f$ and $a^- f_n \to g$, with $(f_n)$ in $D(a^-)$. Since $f_n \in D(a^-)$ for fixed $n$, there exists
$(f_{m,n})_m$ in $D(a)$ such that $\lim_m f_{m,n} = f_n$ and $\lim_m a f_{m,n} = g_n$ exists. Then clearly

\[ \lim_{m,n} f_{m,n} = f, \tag{B.619} \]

and we claim that

\[ \lim_{m,n} a f_{m,n} = g. \tag{B.620} \]

Namely, $\|a f_{m,n} - g\| \leq \|a f_{m,n} - a^- f_n\| + \|a^- f_n - g\|$. For $\varepsilon > 0$, take $n$ so that the
second term is $< \varepsilon/2$. For that $n$, the vectors $a^-(f_{m,n} - f_n)$ converge, as $m \to \infty$, since
$a f_{m,n} \to g_n$ and $a^- f_n$ is independent of $m$. Also, recall that $f_{m,n} - f_n \to 0$ as $m \to \infty$.
By assumption, $a$ is closable, hence by definition one must have $a^-(f_{m,n} - f_n) \to 0$
in $m$. Hence we may find $m$ so that $\|a f_{m,n} - a^- f_n\| < \varepsilon/2$, so that $\|a f_{m,n} - g\| < \varepsilon$,
and (B.620) follows. Hence $f \in D(a^-)$. Finally, since $a^- f = \lim_{m,n} a f_{m,n}$ one has
$a^- f = g$ by (B.620), or $a^- f = \lim_n a^- f_n$ by definition of $g$. Thus $a^-$ is closed.

§B.15. The spectral theorem 

By (B.319), von Neumann algebras like $W^*(a)$ are complete under strong convergence
of nets (rather than merely sequences), and if some net is monotone increasing
(or decreasing) and bounded, the strong limit equals the supremum (or infimum), as
in Proposition B.98. This yields operatorial versions of (B.40) - (B.44):

\begin{align*}
e_U &= \sup\{f(a) \mid f \in C_c(U),\, 0 \leq f \leq 1_{\sigma(a)}\}; \tag{B.621} \\
e_K &= \inf\{f(a) \mid f \in C(\sigma(a)),\, 0 \leq f \leq 1_{\sigma(a)},\, f|_K = 1_K\}; \tag{B.622} \\
e_A &= \inf\{e_U \mid U \supseteq A,\, U \in \mathcal{O}(\sigma(a))\} \tag{B.623} \\
&= \sup\{e_K \mid K \subseteq A,\, K \in \mathcal{K}(\sigma(a))\}, \tag{B.624}
\end{align*}

where $U \in \mathcal{O}(\sigma(a))$ is open, $K \in \mathcal{K}(\sigma(a))$ is compact, and $A \subset \sigma(a)$ is Borel.

§B.16. Abelian *-algebras in B(H)
For an alternative proof of Proposition B.106, one observes that 


1 1 
vo [t= [bw = (Vivi ow/ vv) (B.625) 
0 0 


defines a bounded functional on $L^2(0,1)$ seen as a dense subspace of $L^1(0,1)$, and
uses the duality $L^1(0,1)^* \cong L^\infty(0,1)$. Indeed, using Cauchy–Schwarz, one has


1 
[Fy] =lev vow Viv < Voll Vvilallv/ Vile = lolli. (8.626 


§B.17. Classification of maximal abelian *-algebras in B(H)

Theorem B.118 goes back to von Neumann (1931); for the details of the second 
proof see Kadison & Ringrose (1986), §9.4, or, very lucidly, Stevens (2016). 
§B.20. The trace 

The trace is often neglected in functional analysis books, except when these tend
to quantum mechanics (Reed & Simon, 1972) or to operator algebras (Pedersen,
1989). Eqs. (B.476) - (B.477) and (B.496) reflect the function space dualities

\begin{align*}
\ell_0(\mathbb{N})^* &\cong \ell^1(\mathbb{N}); \tag{B.627} \\
\ell^1(\mathbb{N})^* &\cong \ell^\infty(\mathbb{N}); \tag{B.628} \\
\ell^2(\mathbb{N})^* &\cong \ell^2(\mathbb{N}). \tag{B.629}
\end{align*}

Similar to the $\ell^p$-spaces, one has Banach spaces $B_p(H)$ residing in $B_0(H)$ for each
$1 \leq p < \infty$, called Schatten–von Neumann ideals, see e.g. Simon (2005).


§B.21. Spectral theory for unbounded self-adjoint operators 

Our approach to unbounded operators via the bounded transform combines ideas 
from Kaufman (1978), Woronowicz (1991), Woronowicz & Napiórkowski (1992),
Schmüdgen (2012), and Koliha (2014). The proof of Theorem B.159 via Lemma
B.160 (due to Nelson, 1959), was suggested to the author by Nigel Higson. The last 
part of §B.21 was inspired by Lemma 5.2.8 in Pedersen (1989), in which we have 
simply replaced the Cayley transform by the bounded transform. 

The idea of affiliating closed operators to a von Neumann algebra goes back to von
Neumann; our brief treatment is hopefully more appealing than the elaborate
constructions in Kadison & Ringrose (1983), §5.6. A number of details were supplied
in the M.Sc. Thesis of Christian Budde (2015); see also Budde & Landsman (2016).

For general C*-algebras $A$, the multiplier algebra $M(A)$ consists of all maps $m : A \to A$
for which there exists an adjoint $n = m^* : A \to A$ such that $b^* m(a) = n(b)^* a$. Such
maps are automatically linear and bounded, and M(A) is a C*-algebra itself as a 
subalgebra of the Banach space B(A) of all bounded linear maps on A, enriched 
with the adjoint m* =n. See, e.g., Lance (1995), or §C.10 below. For commuta- 
tive C*-algebras this reduces to the definition in the main text, which dates from 
Wang (1961). For unbounded multipliers see Woronowicz (1991) and Lance (1995); 
Woods (1979) treats the bounded case. 


Appendix C 
Operator algebras 


This appendix provides a short course in operator algebras, building on the previous 
appendix. Indeed, there is surprisingly little algebra in the subject (so that there are 
hardly any prerequisites in that direction), and quite a lot of functional analysis, 
involving both operators on Hilbert space and more general Banach space theory. 
Traditionally, the field of operator algebras has had two branches: C*-algebras 
and von Neumann algebras. Although historically speaking the latter (invented by 
von Neumann in 1930) preceded the former (introduced by Gelfand and Naimark 
in 1943), the logical order of presentation is the opposite, since von Neumann al- 
gebras turned out to be special cases of C*-algebras (with additional structure). 
Furthermore, for reasons in the foundations of quantum mechanics (as explained in 
the main text), beside von Neumann algebras we will discuss a few lesser known 
special cases of C*-algebras, such as scattered C*-algebras and AW*-algebras. 


C.1 Basic definitions and examples 


A C*-algebra is both an associative algebra and a Banach space, as follows: 


Definition C.1. 1. A Banach algebra is a Banach space $A$ that is simultaneously
an algebra in which

\[ \|ab\| \leq \|a\|\,\|b\| \quad (a,b \in A). \tag{C.1} \]

2. An involution on an algebra $A$ is a real-linear map $* : A \to A$, written $a \mapsto a^*$,
such that $a^{**} = a$, $(ab)^* = b^* a^*$, and $(\lambda a)^* = \overline{\lambda} a^*$ for all $a,b \in A$ and $\lambda \in \mathbb{C}$. An
algebra with involution is also called a *-algebra.
3. A C*-algebra is a Banach algebra $A$ with involution in which

\[ \|a^* a\| = \|a\|^2 \quad (a \in A). \tag{C.2} \]

With the same proof as (A.22), these axioms imply

\[ \|a^*\| = \|a\|. \tag{C.3} \]

The three main examples (at least for a first orientation) are:


• The space $C_0(X)$ of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity,
where $X$ is some locally compact Hausdorff space (see §B.3). This is an algebra
under pointwise operations: addition is given by $(\lambda \cdot f + g)(x) = \lambda f(x) + g(x)$,
and multiplication is $(fg)(x) = f(x)g(x)$. Furthermore, it has a natural involution
$f^*(x) = \overline{f(x)}$, and a natural norm $\|f\|_\infty = \sup_{x \in X}\{|f(x)|\}$, cf. (B.23). The above
axioms of a C*-algebra are easily verified. Note that $C_0(X)$ has a unit (namely the
function $1_X$ equal to 1 for any $x$) iff $X$ is compact. It is of fundamental importance
for physics and mathematics that $C_0(X)$ is a commutative C*-algebra.

• The space $B(H)$ of all bounded operators on some Hilbert space $H$, with obvious
algebraic operations, involution given by the adjoint (see (A.15)), and the standard
operator norm $\|a\| = \sup\{\|a\psi\|, \psi \in H, \|\psi\| = 1\}$. See Proposition A.7 and
Theorem B.33 for the proof that $B(H)$ is a C*-algebra; it has a unit, given by the
identity $1_H$. If $\dim(H) > 1$, this is a highly non-commutative C*-algebra.

• The space $B_0(H)$ of all compact operators on some Hilbert space $H$, with operations
inherited from $B(H)$; see Theorem B.130, which not merely shows that
$B_0(H)$ is a C*-algebra, but also that it is a (closed) two-sided ideal in $B(H)$. It
fails to have a unit whenever $H$ is infinite-dimensional (this follows from almost
any result in §B.19, such as Theorem B.135).


Definition C.2. 1. A homomorphism between C*-algebras $A$ and $B$ is a linear map
$\varphi : A \to B$ that for all $a,b \in A$ satisfies

\begin{align*}
\varphi(ab) &= \varphi(a)\varphi(b); \tag{C.4} \\
\varphi(a^*) &= \varphi(a)^*. \tag{C.5}
\end{align*}

2. An isomorphism between two C*-algebras is an invertible homomorphism. If $A$
and $B$ are isomorphic as C*-algebras in this sense, we write $A \cong B$.

It follows from linear algebra that the set-theoretic inverse of an invertible linear
map $\varphi : A \to B$ is automatically linear. It is similarly easy to show that the inverse
of an invertible homomorphism is itself a homomorphism, but it is a deeper fact
about C*-algebras that an isomorphism is automatically isometric (and hence has
an isometric inverse); see Theorem C.62. Furthermore, if $B = \mathbb{C}$, then the property
$\varphi(a^*) = \overline{\varphi(a)}$ follows from the other conditions on a homomorphism.

The following notion, originally inspired by quantum mechanics (and turned into 
mathematics by von Neumann), gives a geometric flavor to operator algebras. 


Definition C.3. A state on a C*-algebra $A$ is a bounded linear map $\omega : A \to \mathbb{C}$ that
satisfies:

1. $\omega(a^*a) \geq 0$, $a \in A$ (positivity);
2. $\|\omega\| = 1$ (normalization).

If $A$ has a unit, the definition of a state considerably simplifies.

Lemma C.4. Let $A$ be a C*-algebra with unit and let $\omega : A \to \mathbb{C}$ be a linear map.
Then $\omega$ is positive iff it is bounded and satisfies $\|\omega\| = \omega(1_A)$.

The proof requires some positivity theory in C*-algebras, so we postpone it to §C.7,
but as of now, we immediately infer that in the unital case we have:

Proposition C.5. A linear map $\omega : A \to \mathbb{C}$ on a unital C*-algebra is a state iff $\omega$ is
positive and satisfies $\omega(1_A) = 1$, and hence iff $\omega$ is bounded with $\|\omega\| = \omega(1_A) = 1$.


Using the Banach–Alaoglu Theorem B.48, this implies that the state space $S(A)$ of
a unital C*-algebra $A$, i.e., the set of all states on $A$, is a compact convex subset of $A^*$
in its w*-topology. Defining the pure state space $P(A)$ of $A$ as the extreme boundary
$\partial_e S(A)$, the Krein–Milman Theorem B.50 almost immediately implies:

Theorem C.6. Let $A$ be a C*-algebra with unit, having state space $S(A)$ and pure
state space $P(A) = \partial_e S(A)$. Then $P(A) \neq \emptyset$ and $S(A) = \mathrm{co}(P(A))^-$.


In words, C*-algebras have sufficiently many pure states to approximate general 
states arbitrarily well, at least in the w*-topology (of “expectation values”). 

The only complication in applying Theorem B.50 to $K = S(A) \subset A^*$ is that $A$ is
a complex Banach space, but the situation may be reduced to the real Banach space

\[ A_{sa} = \{a \in A \mid a^* = a\}. \tag{C.6} \]

Lemma C.7. Let $A$ be a C*-algebra with unit. If $\omega \in S(A)$, then $\omega(a^*) = \overline{\omega(a)}$.

Proof. Using Definition C.3.2 and eq. (C.2), for any $a^* = a$ and $t \in \mathbb{R}$ we have

\[ |\omega(a + it)|^2 \leq \|a + it\|^2 = \|(a - it)(a + it)\| = \|a^2 + t^2\| \leq \|a\|^2 + t^2. \tag{C.7} \]

Writing $\omega(a) = \alpha + i\beta$, where $\alpha,\beta \in \mathbb{R}$, this gives $\alpha^2 + \beta^2 + 2\beta t \leq \|a\|^2$ for all $t \in \mathbb{R}$,
which forces $\beta = 0$. This proves the claim for self-adjoint $a$. For the general case,
one uses the following decomposition of $a$ as a sum of two self-adjoint operators:

\begin{align*}
a &= b + ic \quad (b^* = b,\ c^* = c); \tag{C.8} \\
b &= \tfrac12 (a + a^*), \quad c = -\tfrac12 i (a - a^*). \tag{C.9}
\end{align*}
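
The two steps of this proof may be written out as follows. This is a sketch that assumes $\omega(1_A) = 1$ (Proposition C.5), so that $\omega(a + it) = \alpha + i(\beta + t)$ for self-adjoint $a$, and uses the decomposition (C.8) - (C.9) for the general case:

```latex
\begin{align*}
% Step 1: beta = 0 for self-adjoint a, from (C.7).
\alpha^2 + (\beta + t)^2 \leq \|a\|^2 + t^2
\;&\Longleftrightarrow\;
2\beta t \leq \|a\|^2 - \alpha^2 - \beta^2 \quad (t \in \mathbb{R}), \\
% This fails for t -> +infinity if beta > 0, and for t -> -infinity if beta < 0.
% Step 2: general a = b + ic, so that a^* = b - ic, with omega(b), omega(c) real by Step 1.
\omega(a^*) = \omega(b) - i\,\omega(c)
&= \overline{\omega(b) + i\,\omega(c)} = \overline{\omega(a)} .
\end{align*}
```
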


Consequently, we may restrict a state $\omega \in S(A)$ to a real-linear functional

\[ \omega_{\mathbb{R}} = \omega|_{A_{sa}} : A_{sa} \to \mathbb{R} \tag{C.10} \]

that satisfies $\omega_{\mathbb{R}}(1_A) = 1$ and $\omega_{\mathbb{R}}(a^2) \geq 0$ for any $a \in A_{sa}$, where we used Theorem
C.52 below to reformulate the positivity condition on states in terms of self-adjoint
operators alone. Conversely, we may extend a state $\omega_{\mathbb{R}}$ on $A_{sa}$ to a state $\omega$ on $A$ by

\[ \omega(a) = \omega_{\mathbb{R}}(b) + i\,\omega_{\mathbb{R}}(c), \tag{C.11} \]

assuming (C.8) - (C.9). We then have $\|\omega\| = \|\omega_{\mathbb{R}}\| = 1$, since obviously
$\|\omega_{\mathbb{R}}\| \leq \|\omega\| = 1$ (since its sup-norm is computed on fewer operators), but also $\omega_{\mathbb{R}}(1_A) = 1$.
Thus we may regard $S(A)$ as a compact convex set in the real Banach space $A_{sa}^*$, rather
than in the complex Banach space $A^*$, and Theorem B.50 applies. Alternatively, one
could have extended the latter to the complex case, which is possible with a similar
(lack of) effort as in the procedure above.

C.2 Gelfand isomorphism


The example A = Co(X) of a commutative C*-algebra given in the previous section 
is more than that; as proved in the very first (1943) paper on C*-algebras by Gelfand 
and Naimark (despite whom one often speaks of , it is generic. 


Theorem C.8. Every commutative C*-algebra $A$ is isomorphic to $C_0(X)$ for some
locally compact Hausdorff space $X$, which is unique up to homeomorphism.

The proof is technically intricate at points, but the main idea is quite simple:

1. The space $X$ may be taken to be the Gelfand spectrum $\Sigma(A)$ of $A$, i.e., the set of
all nonzero linear maps $\omega : A \to \mathbb{C}$ that satisfy $\omega(ab) = \omega(a)\omega(b)$ (and hence
are homomorphisms $A \to \mathbb{C}$ as algebras). For example, if $A$ is already given as
$C_0(X)$, then each $x \in X$ defines $\omega_x \in \Sigma(A)$ by $\omega_x(f) = f(x)$, which is linear and
multiplicative (by the pointwise definition of addition and multiplication in $A$).

2. The Gelfand transform maps each $a \in A$ to a function $\hat{a} : \Sigma(A) \to \mathbb{C}$ by

\[ \hat{a}(\omega) = \omega(a) \quad (a \in A,\ \omega \in \Sigma(A)). \tag{C.12} \]

3. The Gelfand topology is the weakest topology on $\Sigma(A)$ making all functions $\hat{a}$
continuous (i.e., the topology generated by the sets $\hat{a}^{-1}(U)$, $U \subset \mathbb{C}$ open, $a \in A$).
In this topology, $\Sigma(A)$ is compact iff $A$ has a unit, and locally compact otherwise.

4. The isomorphism $A \to C_0(\Sigma(A))$, then, is just given by the Gelfand transform.
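
To illustrate steps 1 - 4 in the simplest possible case (an example added here for orientation, not taken from the main text), let $A = \mathbb{C}^n$ with pointwise operations and the max-norm, i.e., $A = C(X)$ for the finite discrete space $X = \{1,\ldots,n\}$. One checks that the only nonzero multiplicative linear functionals are the coordinate evaluations, so that:

```latex
\begin{align*}
\Sigma(\mathbb{C}^n) &= \{\omega_1, \ldots, \omega_n\}, \qquad
\omega_i(a) = a_i \quad (a = (a_1,\ldots,a_n)), \\
\hat{a}(\omega_i) &= \omega_i(a) = a_i, \qquad
\|\hat{a}\|_\infty = \max_{1 \leq i \leq n} |a_i| = \|a\| .
\end{align*}
```

In this case the Gelfand transform simply reads off the components of $a$, and the Gelfand topology on the $n$-point set $\Sigma(\mathbb{C}^n)$ is discrete.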


This picture becomes even more compelling from the following observation: 


Lemma C.9. For any (i.e. not necessarily commutative) C*-algebra $A$ we have
$\Sigma(A) \subset A^*$. Furthermore, for any $\omega \in \Sigma(A)$,

\[ \|\omega\| = 1, \tag{C.13} \]

and if $A$ has a unit $1_A$, then also

\[ \omega(1_A) = 1. \tag{C.14} \]

In other words, multiplicative linear functionals on $A$ are automatically continuous
(recall that $A^*$ is the Banach space of continuous linear maps from $A$ to $\mathbb{C}$, see §B.9).

Throughout the rest of this section we restrict all proofs to the unital case; the 
general case may be handled by the technique of unitization to be discussed in §C.6. 


Proof. Let $\omega \in \Sigma(A)$. By multiplicativity, $\ker(\omega)$ is a two-sided ideal in $A$. Trivially,
for any $a \in A$, we have $a - \omega(a) \cdot 1_A \in \ker(\omega)$. If this element were invertible, then
$\ker(\omega)$ would contain the unit $1_A$ and hence would coincide with $A$, contradicting
the definition of $\Sigma(A)$ (which requires $\omega$ to be nonzero). Hence $\omega(a) \in \sigma(a)$. By
the spectral radius formula (B.255) we have $|\omega(a)| \leq \|a\|$, whence $\omega \in A^*$.
Furthermore, $\omega(1_A)^2 = \omega(1_A)$, whence $\omega(1_A) = 1$ or 0, the latter being excluded
since it would imply that $\omega(a) = 0$ for all $a \in A$. This gives (C.14) (which also
follows from Lemma C.4, given Lemma C.11 below), which in turn gives (C.13).

The Gelfand topology on $\Sigma(A)$ coincides with the weak* topology inherited
from $A^*$, which is simply the topology of pointwise convergence (i.e. $\omega_\lambda \to \omega$ iff
$\omega_\lambda(a) \to \omega(a)$ for each $a \in A$), and the Gelfand transform $a \mapsto \hat{a}$ is (by abuse of
notation) the image of $a$ in $A^{**}$ under the canonical injection $A \to A^{**}$ appearing in
Proposition B.44, restricted (as a function on $A^*$) to the subset $\Sigma(A) \subset A^*$. From this
perspective, continuity of $\hat{a}$ immediately follows from Proposition B.46.

This picture of the Gelfand topology also has a technical advantage, for we infer: 


Lemma C.10. If $A$ is unital, then its Gelfand spectrum $\Sigma(A)$ is compact Hausdorff.

Proof. By Lemma C.9, $\Sigma(A)$ lies in the unit ball of $A^*$, which by the Banach–Alaoglu
Theorem is compact in its weak* topology. So we are done if we show
that $\Sigma(A)$ is a weak*-closed subset of $A^*$, which is obvious from its definition: if
$\omega_\lambda \to \omega$, then for any $a, b \in A$ we obviously have

\[ \omega(ab) = \lim_\lambda \omega_\lambda(ab) = \lim_\lambda \omega_\lambda(a)\omega_\lambda(b) = \omega(a)\omega(b). \tag{C.15} \]

We now show that the Hausdorff property of $\Sigma(A)$ is inherited from $A^*$. A subbasis
of its weak* topology is given by sets of the form

\[ U_\varepsilon^a(\varphi) = \{\rho \in A^*,\ |\varphi(a) - \rho(a)| < \varepsilon\}, \tag{C.16} \]

where $a \in A$, $\varphi \in A^*$, and $\varepsilon > 0$. Replacing $\rho \in A^*$ by $\rho \in \Sigma(A)$ we thus obtain
a subbasis of the Gelfand topology. If $\omega$ and $\omega'$ are distinct points in $\Sigma(A)$, there
exists $a \in A$ such that $\omega(a) \neq \omega'(a)$. Taking some $0 < \varepsilon < |\omega(a) - \omega'(a)|/2$, the
two points in question are separated by the opens $U_\varepsilon^a(\omega)$ and $U_\varepsilon^a(\omega')$.


It is immediate from the definition of $\Sigma(A)$ that $a \mapsto \hat{a}$ is an algebra homomorphism,
since we have

\[ \widehat{ab}(\omega) = \omega(ab) = \omega(a)\omega(b) = \hat{a}(\omega)\hat{b}(\omega) = (\hat{a} \cdot \hat{b})(\omega). \tag{C.17} \]

The fact that the Gelfand transform preserves the involution follows from:

Lemma C.11. If $\omega \in \Sigma(A)$, then $\omega(a^*) = \overline{\omega(a)}$, and hence $\widehat{a^*} = (\hat{a})^*$.

Proof. Using (C.14) and (C.2), the proof is the same as for Lemma C.7.


The hard part of the proof of Theorem C.8 is isometricity of the Gelfand transform:

\[ \|\hat{a}\|_\infty = \|a\|. \tag{C.18} \]

As always, isometricity obviously implies injectivity. Surprisingly, using the
Stone–Weierstrass Theorem B.51, in this case isometricity also yields surjectivity of the
map $a \mapsto \hat{a}$. Namely, if we take $X = \Sigma(A)$, and $B$ to be the image $\hat{A}$ of $A$ under the
Gelfand transform, then the conditions on $B$ in Theorem B.51 are easily verified. Assuming
(C.18), this image is obviously closed, so that $\hat{A} = C(\Sigma(A))$. With injectivity
also implied by (C.18), it follows that the Gelfand transform is an isomorphism.

It remains to prove (C.18), which conceptually is a conjunction of two equalities:

\begin{align*}
\|\hat{a}\|_\infty &= r(a); \tag{C.19} \\
\|a\| &= r(a) \quad (a^* = a), \tag{C.20}
\end{align*}

where $r(a) = \sup\{|\lambda|, \lambda \in \sigma(a)\}$ is the spectral radius of $a$, see Theorem B.84.
These immediately yield (C.18) for self-adjoint $a$, from which the general case follows
from (C.2), noting that $a^*a$ is self-adjoint for any $a$: assuming (C.19) - (C.20)
as well as the homomorphism property of the Gelfand transform, we compute

\[ \|\hat{a}\|_\infty^2 = \|\hat{a}^* \hat{a}\|_\infty = \|\widehat{a^*a}\|_\infty = \|a^*a\| = \|a\|^2. \tag{C.21} \]

Since (C.20) just repeats (B.257), we already know it is true for general C*-algebras
(so far, with unit). As we shall now show, (C.19) holds in any commutative Banach
algebra with unit. The key is the following lemma.


Lemma C.12. Let $A$ be a commutative Banach algebra with unit and let $a \in A$. For
any $\lambda \in \sigma(a)$ there is an element $\omega \in \Sigma(A)$ such that $\lambda = \omega(a)$.

Granted this, and using the proof of Lemma C.9 as well as (B.253), we obtain

\[ \sigma(a) = \sigma(\hat{a}), \tag{C.22} \]

for any $a \in A$. Given (B.254), this yields (C.19) and hence the Gelfand isomorphism.
There are two approaches to our crucial Lemma C.12, each having its own merits. 

The first and best known proof, going back to Gelfand himself, relies on the theory
of (maximal) ideals in Banach algebras. It is based on the following identification:

Proposition C.13. Let $A$ be a commutative Banach algebra with unit. There is a
bijective correspondence between $\Sigma(A)$ and the set $M(A)$ of maximal ideals in $A$,

\[ \omega \leftrightarrow \ker(\omega). \tag{C.23} \]

This will be proved in §C.8 below, which also contains the relevant background.

It implies Lemma C.12, as follows: if $\lambda \in \sigma(a)$, then by definition $a - \lambda$ is not
invertible in $A$, so that $J = \{(a - \lambda)b \mid b \in A\}$ is a proper ideal in $A$. By Zorn’s Lemma (or
Hausdorff’s Maximality Theorem), applied to the partially ordered set of all proper
ideals in $A$ that contain $J$ (ordered by inclusion), $J$ is contained in some maximal
ideal, so that $J \subset \ker(\omega)$ for some $\omega \in \Sigma(A)$. Since $a - \lambda \in J$ (take $b = 1_A$), from
(C.14) we obtain $\omega(a) = \lambda$. Note the non-constructive nature of this argument!


The other line of proof, due to Kadison, uses a different characterization of $\Sigma(A)$:

Proposition C.14. Let $A$ be a commutative C*-algebra with unit. Then the Gelfand
spectrum $\Sigma(A)$ coincides with the pure state space $P(A)$.

Recall Definition 1.10 and Theorem C.6; the pure state space $P(A) = \partial_e S(A)$ of a
C*-algebra $A$ is defined as the extreme boundary of the state space of $A$. The argument that
instantly delivers Lemma C.12 from Proposition C.14, then, is as follows:

Proposition C.15. Let $A$ be a C*-algebra with unit. For any normal element $a \in A$
(i.e., $aa^* = a^*a$) and $\lambda \in \sigma(a)$, there is a pure state $\omega \in P(A)$ such that $\omega(a) = \lambda$.


The proof of both results uses some positivity theory for C*-algebras, which is
systematically developed in §C.7 below. Here, we just need that $a \in A$ is positive,
written $a \geq 0$, iff $a = b^*b$ for some $b \in A$, iff $a$ is self-adjoint with $\sigma(a) \subset [0,\infty)$.

We write $a \geq b$ or $b \leq a$ if $a - b$ is positive. Also, a linear functional $\omega : A \to \mathbb{C}$ is
called positive iff $\omega(a) \geq 0$ for all $a \geq 0$, and we write $\omega \geq \varphi$ or $\varphi \leq \omega$ if $\omega - \varphi \geq 0$.

Let us note that the proofs of these results in §C.7 use some Gelfand theory, but
this use is limited to Theorem C.25, which could have been proved à la Theorem
C.24, whose proof derives the Gelfand isomorphism in the special case at hand.
Therefore, the use of Propositions C.14 and C.15 in the proof of (C.18) and hence
of Theorem C.8 does not render this line of proof of the latter circular.

In particular, the proof of Proposition C.14 relies on:

Lemma C.16. If $a^* = a \in A$ there is a number $t > 0$ such that $t \pm a \geq 0$.

Proof. Since $\sigma(a) \subset \mathbb{R}$ is compact (see Corollary C.27 and Theorem B.84), we have
$\sigma(a) \subset [-t,t]$ for some $t > 0$. It is clear from the definition of $\sigma(a)$ that
$\sigma(t \pm a) = t \pm \sigma(a)$, which yields the lemma by the criterion for positivity just stated.


We now prove Proposition C.14. 


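Proof. It is clear from Lemma C.11 and eq. (C.14) that $\omega \in \Sigma(A)$ is a state. To show
that $\omega$ is pure, we use the fact that for any state $\omega \in S(A)$, the expression

\[ (b,a) = \omega(b^*a) \tag{C.24} \]

defines a hermitian form on $A$; the easy proof again uses Lemma C.11. Applying
Cauchy–Schwarz with $b \rightsquigarrow 1_A$ and using $1_A^* 1_A = 1_A$ and $\omega(1_A) = 1$ gives

\[ |\omega(a)|^2 \leq \omega(a^*a). \tag{C.25} \]

Now suppose that $\omega = \lambda\omega_1 + (1-\lambda)\omega_2$ with $\omega_i \in S(A)$ and $\lambda \in (0,1)$. Applying
(C.25) (in the opposite direction) to $\omega_1$ and $\omega_2$ gives

\[ \omega(a^*a) \geq \lambda|\omega_1(a)|^2 + (1-\lambda)|\omega_2(a)|^2. \tag{C.26} \]

On the other hand, multiplicativity of $\omega$ gives

\[ \omega(a^*a) = \lambda^2|\omega_1(a)|^2 + \lambda(1-\lambda)\bigl(\overline{\omega_1(a)}\omega_2(a) + \overline{\omega_2(a)}\omega_1(a)\bigr) + (1-\lambda)^2|\omega_2(a)|^2. \]

Subtracting this from (C.26) gives the inequality $0 \geq \lambda(1-\lambda)|\omega_1(a) - \omega_2(a)|^2$, so
that $\omega_1 = \omega_2$ and hence $\omega$ is pure by definition. This shows that $\Sigma(A) \subset P(A)$.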
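
The subtraction in the last step may be checked as follows. This is a sketch in which the abbreviation $z_i = \omega_i(a)$ is introduced here purely for readability, and multiplicativity of $\omega$ is used in the form $\omega(a^*a) = |\omega(a)|^2 = |\lambda z_1 + (1-\lambda) z_2|^2$:

```latex
\begin{align*}
0 &\geq \lambda |z_1|^2 + (1-\lambda)|z_2|^2 - \bigl|\lambda z_1 + (1-\lambda) z_2\bigr|^2 \\
  &= \lambda(1-\lambda)\bigl(|z_1|^2 + |z_2|^2 - \overline{z_1} z_2 - \overline{z_2} z_1\bigr) \\
  &= \lambda(1-\lambda)\,|z_1 - z_2|^2 .
\end{align*}
```

Since $\lambda(1-\lambda) > 0$ for $\lambda \in (0,1)$, this forces $z_1 = z_2$ for every $a \in A$.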
To prove the converse inclusion, we need another lemma.

Lemma C.17. Let $\omega \in P(A)$ be a pure state on $A$. If $\tau : A \to \mathbb{C}$ is a linear functional
such that $0 \leq \tau \leq \omega$, then we can find a scalar $s \in [0,1]$ such that $\tau = s\omega$.

Proof. We assume $\tau \neq 0$ and $\tau \neq \omega$ (otherwise the claim is trivially true). By
Lemma C.16, this implies $\tau(1_A) \neq 0$ and $\tau(1_A) \neq 1$. For if $\tau(1_A) = 0$, then for
$a^* = a$ we find $t$ as in Lemma C.16, so that $t \pm a \geq 0$ and hence $0 \leq \tau(t \pm a) = \pm\tau(a)$.
Hence $\tau(a) = 0$ on each self-adjoint $a$, which forces $\tau = 0$ by the usual
decomposition (C.8). If $\tau(1_A) = 1$, we apply a similar argument to the positive functional
$\omega - \tau$. Therefore, $t = 1 - \tau(1_A)$ satisfies $t \in (0,1)$, and defining $\omega_1 = (\omega - \tau)/t$
and $\omega_2 = \tau/\tau(1_A)$ we obtain a decomposition $\omega = t\omega_1 + (1-t)\omega_2$. Since $\omega$ is
pure, this gives $\omega_1 = \omega_2 = \omega$ and hence $\tau = \tau(1_A)\omega$. Clearly, $0 \leq \tau \leq \omega$ enforces
$0 \leq \tau(1_A) \leq 1$, so the claim follows with $s = \tau(1_A)$.
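
The decomposition used in this proof can be verified directly. The following lines are a sketch, with $t = 1 - \tau(1_A)$, $\omega_1 = (\omega - \tau)/t$ and $\omega_2 = \tau/\tau(1_A)$ as in the proof, and $\omega(1_A) = 1$ by Proposition C.5:

```latex
\begin{align*}
t\,\omega_1 + (1-t)\,\omega_2 &= (\omega - \tau) + \tau(1_A)\,\frac{\tau}{\tau(1_A)} = \omega, \\
\omega_1(1_A) &= \frac{\omega(1_A) - \tau(1_A)}{1 - \tau(1_A)} = 1, \qquad
\omega_2(1_A) = \frac{\tau(1_A)}{\tau(1_A)} = 1 .
\end{align*}
```

Both $\omega_1$ and $\omega_2$ are positive because $0 \leq \tau \leq \omega$, so that they are indeed states by Proposition C.5.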


We now prove that $\omega \in P(A)$ is multiplicative on arbitrary $a \in A$, and $b \in A$ such
that (for the moment) $0 \leq b \leq 1_A$. Define $\omega_b : A \to \mathbb{C}$ by $\omega_b(a) = \omega(ab)$. Then
$0 \leq \omega_b \leq \omega$: taking $b = c^*c$, the first inequality $0 \leq \omega_b$ follows from

\[ \omega_b(a^*a) = \omega(c^*ca^*a) = \omega((ac)^*ac) \geq 0, \tag{C.27} \]

since $A$ is abelian, and the second is analogous, using the fact that $0 \leq b \leq 1_A$ implies
$0 \leq 1_A - b \leq 1_A$. Therefore, Lemma C.17 gives $\omega_b = s\omega$ with $s = \omega_b(1_A) = \omega(b)$.

For general $0 \neq b \geq 0$, we rewrite $b$ as $b = \|b\| \cdot (b/\|b\|)$, and use linearity of $\omega$
and the previous result to obtain multiplicativity. For general self-adjoint $b$ we use
Lemma C.53, and finally we use (C.8).


At last, we are now in a position to prove Proposition C.15, so let $a \in A$ be normal.

Proof. Let $C^*(a)$ be the commutative C*-algebra generated by $a$ (and hence $a^*$) and
$1_A$ within $A$; as in Theorem C.25 below, this is the norm-closure of all polynomials in
$a$ and $a^*$, and $C(\sigma(a)) \cong C^*(a)$ via the map $f(\lambda,\overline{\lambda}) \mapsto f(a,a^*)$. Using Proposition
C.14, define a pure state $\omega_\lambda$ on $C^*(a)$ by linear and multiplicative extension of
$\omega_\lambda(1_A) = 1$, $\omega_\lambda(a) = \lambda$, and $\omega_\lambda(a^*) = \overline{\lambda}$, i.e., $\omega_\lambda(f(a,a^*)) = f(\lambda,\overline{\lambda})$.

Since $\|\omega_\lambda\| = 1$, Hahn–Banach (Corollary B.41, with $V \rightsquigarrow A$ and $W \rightsquigarrow C^*(a)$)
yields a linear extension $\omega'_\lambda : A \to \mathbb{C}$ of $\omega_\lambda$, which is in fact a state by Lemma
C.4. To show that $\omega'_\lambda$ may be chosen to be pure also on $A$, let $S_\lambda(A) \subset S(A)$ be the
set of all states on $A$ that extend $\omega_\lambda$. This is a nonempty weak*-closed and hence
weak*-compact convex subset of $S(A)$, which by the Krein–Milman Theorem B.50
has nonempty extreme boundary $\partial_e S_\lambda(A)$. It is easy to show that $\partial_e S_\lambda(A) \subset \partial_e S(A) = P(A)$:
for $\omega \in \partial_e S_\lambda(A)$, suppose $\omega = t\omega_1 + (1-t)\omega_2$ with $t \in (0,1)$ and $\omega_i \in S(A)$. Since
$\omega|_{C^*(a)} = \omega_\lambda$ is pure, we have $\omega_1|_{C^*(a)} = \omega_2|_{C^*(a)} = \omega_\lambda$, or $\omega_i \in S_\lambda(A)$. But $\omega$ was
assumed pure in $S_\lambda(A)$, so that $\omega_1 = \omega_2 = \omega$, i.e., $\omega \in \partial_e S(A)$. Hence if we choose
$\omega \in \partial_e S_\lambda(A)$, then the extension $\omega$ of $\omega_\lambda$ is also pure on $A$.


The following ingredients are still missing from the proof of Theorem C.8: 


• The proof of the uniqueness of X up to homeomorphism (see §C.3).
• The proof of Proposition C.13 (see §C.8).
• The extension of the entire argument to the non-unital case (see §C.6).


We start with the first issue, which we fill in more broadly than needed for the proof 
of Theorem C.8, namely, as part of a broader picture called Gelfand duality (which 
will fall into place if one uses the language of category theory, see Appendix E). 

C.3 Gelfand duality


Theorem C.8 is a consequence of the following two propositions. 


Proposition C.18. Let $A$ and $B$ be unital commutative C*-algebras. Then

$$\varphi = \alpha^*, \tag{C.28}$$

where $\alpha^*(\omega) = \omega \circ \alpha$, establishes a bijective correspondence between unital
homomorphisms $\alpha : A \to B$ and continuous maps $\varphi : \Sigma(B) \to \Sigma(A)$.
In particular, $\Sigma(A)$ and $\Sigma(B)$ are homeomorphic iff $A$ and $B$ are isomorphic.

Proof. Since $\alpha(ab) = \alpha(a)\alpha(b)$, if $\omega \in \Sigma(B)$ it is clear that $\alpha^*(\omega) \in \Sigma(A)$.
Conversely, denoting the pertinent Gelfand transforms by $G_A : A \to C(\Sigma(A))$ and
$G_B : B \to C(\Sigma(B))$, given $\varphi : \Sigma(B) \to \Sigma(A)$, we define $\alpha : A \to B$ by

$$\alpha = G_B^{-1} \circ \varphi^* \circ G_A, \tag{C.29}$$

where $\varphi^* : C(\Sigma(A)) \to C(\Sigma(B))$ is the pullback of $\varphi$ (i.e., $\varphi^*(f) = f \circ \varphi$).

It is easy to verify that given $\varphi$, the map $\alpha$ defined in (C.29) returns $\varphi$ through
(C.28), whereas given $\alpha$, the map $\varphi$ defined in (C.28) returns $\alpha$ through (C.29).


Proposition C.19. For any compact Hausdorff space $X$, the evaluation map

$$\mathrm{ev} : X \to \Sigma(C(X)); \tag{C.30}$$
$$\mathrm{ev}_x(f) = f(x), \tag{C.31}$$

is a homeomorphism, so that

$$\Sigma(C(X)) \cong X. \tag{C.32}$$


Proof. Injectivity of ev immediately follows from Urysohn's lemma (which applies
because a compact Hausdorff space is normal), which implies that $C(X)$ separates
points on $X$ (i.e., for all $x \neq y$ there is an $f \in C(X)$ for which $f(x) \neq f(y)$).

To prove surjectivity, suppose there is $\omega \in \Sigma(C(X))$ such that $\omega \neq \mathrm{ev}_x$ for all $x \in X$.
Now $\ker(\omega) = \ker(\mathrm{ev}_x)$ would imply $\omega = \mathrm{ev}_x$ (because $\omega(f) = \lambda$ then implies
$f - \lambda \cdot 1_X \in \ker(\omega)$, and hence $f(x) = \lambda$, and vice versa), so $\ker(\omega) \neq \ker(\mathrm{ev}_x)$.
Since $\mathrm{ev}_x \in \Sigma(C(X))$, and $\omega \in \Sigma(C(X))$ by assumption, by Proposition C.13 both
kernels are maximal ideals in $C(X)$, and hence $\ker(\omega) \subset \ker(\mathrm{ev}_x)$ is impossible (and
so is the opposite inclusion). Therefore, for each $x$ there is a function $f_x \in \ker(\omega)$
for which $f_x(x) \neq 0$ (for otherwise $f(x) = 0$ for all $f \in \ker(\omega)$, so that $\ker(\omega) \subset
\ker(\mathrm{ev}_x)$). Redefining $f_x$ by a phase if necessary, we may assume that $f_x(x) > 0$, and
taking the real part of $f_x$ if necessary, we may also assume that $f_x$ is real-valued.
Finally, replacing $f_x$ by $f_x^2$ (which again lies in the ideal $\ker(\omega)$), we may assume
that $f_x \geq 0$ everywhere.

For each $x$, the set $U_x$ where $f_x > 0$ is open, because $f_x$ is continuous. This
gives a covering $\{U_x\}_{x \in X}$ of $X$, which by compactness has a finite subcovering
$\{U_{x_n}\}_{n=1,\ldots,N}$. Then define the function

$$f = \sum_{n=1}^{N} f_{x_n}, \tag{C.33}$$

which is strictly positive by construction, so that it is invertible. But $\ker(\omega)$ is an
ideal, so that, with all $f_{x_n} \in \ker(\omega)$ (since all $f_x \in \ker(\omega)$) also $f \in \ker(\omega)$. But
an ideal containing an invertible element must contain $1_X$ and hence coincides with
$C(X)$, contradicting the fact that $\ker(\omega)$ was maximal. Hence ev is surjective.
Finally, to prove that ev is a homeomorphism, we equip $X$ with the topology
induced by ev, in which the open sets are of the form $\mathrm{ev}^{-1}(U)$, with $U$ open in
$\Sigma(C(X))$ in the Gelfand topology. We claim that this new topology on $X$ is weaker
than the original one (this terminology includes the possibility that the two topologies
in question coincide). Namely, for $f \in C(X)$ one has $\hat{f} \circ \mathrm{ev} = f$. Therefore, since
the Gelfand topology on $\Sigma(C(X))$ is the weakest topology for which all Gelfand
transforms $\hat{f}$ are continuous, the new topology on $X$ is the weakest topology for
which all $f$ are continuous. But $f$ was already continuous with respect to the given
topology, so the claim follows. Without proof we now state a result from topology:


Lemma C.20. If a set $X$ is Hausdorff in some topology $\mathcal{O}_1(X)$ and compact in a
topology $\mathcal{O}_2(X)$, and if $\mathcal{O}_1(X) \subseteq \mathcal{O}_2(X)$, then $\mathcal{O}_1(X) = \mathcal{O}_2(X)$.


Since X is in fact compact and Hausdorff in both topologies, we conclude from this 
lemma that the new topology on X must coincide with the original one. 


Uniqueness of the Gelfand spectrum up to homeomorphism follows from Propositions
C.18 and C.19: if $A$ is a unital commutative C*-algebra for which $A \cong C(X)$
as well as $A \cong C(Y)$, then applying $\Sigma$ and using (C.32) makes $X$ and $Y$ both
homeomorphic to $\Sigma(A)$, and hence to each other.


With minor changes, the proof of Proposition C.19 applies also to "well-behaved"
manifolds, by which we mean second countable smooth locally compact
Hausdorff manifolds. These are the ones encountered in physics (especially in classical
mechanics); we need this for Theorem 3.10 in the main text. Such manifolds
admit partitions of unity subordinate to any given cover $(U_i)$ that are locally finite
as well as countable, i.e., sequences of smooth functions $\chi_n : X \to [0,1]$ such that:

1. Each $x \in X$ has an open neighbourhood $U$ that intersects only finitely many of
the sets $\mathrm{supp}(\chi_n)$;

2. For each $x \in X$ we have $\sum_n \chi_n(x) = 1$ (where the sum is finite);

3. Each set $\mathrm{supp}(\chi_n)$ is contained in some $U_i$.

Furthermore, $\Sigma(C^\infty(X))$ is defined as for any complex associative algebra $A$, i.e.,
as the set of nonzero multiplicative linear maps $\omega : C^\infty(X) \to \mathbb{C}$.


Proposition C.21. For any second countable smooth locally compact Hausdorff
manifold $X$, the evaluation map $\mathrm{ev} : X \to \Sigma(C^\infty(X))$ in (C.31) is a bijection.


Proof. Since $X$ is not necessarily compact, we cannot use Urysohn's Lemma directly
to prove that $C^\infty(X)$ separates points of $X$ (so that ev is injective), but this time,
if $U \subset X$ is open and $F \subset U$ is closed, there exists a smooth function $\chi : X \to [0,1]$
such that $\chi = 1$ on $F$ and $\chi = 0$ on $X \setminus U$. Indeed, $\{U, X \setminus F\}$ is an open cover of $X$,
and if $(\chi_U, \chi_{X \setminus F})$ is a partition of unity subordinate to this cover, $\chi = \chi_U$ will do.
Now for $x \neq y$, take $F = \{x\}$ and use the Hausdorff property to separate $(x, y)$ by
disjoint open sets $(U, V)$, and we have $\chi(x) = 1$ whilst $\chi(y) = 0$.

The proof of surjectivity is the same as for $C(X)$, including the proof that $\ker(\omega)$
is a maximal ideal in $C^\infty(X)$, until the point (C.33) is reached. Here compactness is
no longer available, so that we need to replace (C.33) by the expression

$$f = \sum_n c_n \chi_n f_{x_n}, \tag{C.34}$$

where $(\chi_n)$ is a smooth partition of unity subordinate to the cover $(U_x)_{x \in X}$, for each
$n \in \mathbb{N}$ the point $x_n$ (and hence $f_{x_n}$) is picked by no. 3 in the list of properties of a
partition of unity listed above, i.e., such that $\mathrm{supp}(\chi_n) \subset U_{x_n}$, and the coefficients
$c_n$ are chosen so that $0 < c_n < (n^2 \|\chi_n f_{x_n}\|_\infty)^{-1}$ (note that
$\chi_n$ and hence $\chi_n f_{x_n}$ has compact support and is continuous, so that it is bounded).
Since $\sum_n (1/n^2) < \infty$, the insertion of the $c_n$ makes $f$ bounded and the sum (C.34)
uniformly convergent, which is necessary to pull $\omega$ through the sum so as to prove
that $f \in \ker(\omega)$, as follows. Since the sup-norm is not defined on all of $C^\infty(X)$, we
need a little argument here. Take $t > \|f\|_\infty$, so that $t \cdot 1_X \pm f$ nowhere vanishes and
hence is invertible, so that $\omega(t \cdot 1_X \pm f) = t \pm \omega(f) \neq 0$ by multiplicativity of $\omega$,
i.e., $\omega(f) \neq \pm t$. Since $f$ and hence $\omega(f)$ is real, this gives $|\omega(f)| \leq \|f\|_\infty$. Since
$\omega(f_{x_n}) = 0$, and similarly for each finite sum in (C.34), we finally obtain

$$|\omega(f)| = \left|\omega\Big(f - \sum_{n=1}^{N} c_n \chi_n f_{x_n}\Big)\right| \leq \left\|f - \sum_{n=1}^{N} c_n \chi_n f_{x_n}\right\|_\infty, \tag{C.35}$$

so letting $N \to \infty$ gives $\omega(f) = 0$, or $f \in \ker(\omega)$. Since $f$ is invertible, this implies
$1_X \in \ker(\omega)$ and hence $\ker(\omega) = C^\infty(X)$, contradicting $\omega \neq 0$.


Corollary C.22. Let $X$ and $Y$ be compact Hausdorff spaces. Then $\alpha(f) = f \circ \varphi$, i.e.,

$$\alpha = \varphi^*, \tag{C.36}$$

establishes a canonical bijective correspondence between unital homomorphisms
$\alpha : C(Y) \to C(X)$ (as C*-algebras) and continuous maps $\varphi : X \to Y$. In particular,
$C(X)$ and $C(Y)$ are isomorphic iff $X$ and $Y$ are homeomorphic.

Likewise, if $X$ and $Y$ are second countable smooth locally compact Hausdorff manifolds,
eq. (C.36) gives a canonical bijective correspondence between homomorphisms
$\alpha : C^\infty(Y) \to C^\infty(X)$ (as commutative algebras) and smooth maps $\varphi : X \to Y$.
In particular, $C^\infty(X)$ and $C^\infty(Y)$ are isomorphic iff $X$ and $Y$ are diffeomorphic.


Proof. The passage from $\varphi$ to $\alpha$ is obvious. We write $\mathrm{ev}_X : X \to \Sigma(C(X))$ and
$\mathrm{ev}_Y : Y \to \Sigma(C(Y))$ for the bijections previously just called ev. Since these maps are
invertible by the previous propositions, we may define a map $\varphi : X \to Y$ by

$$\varphi = \mathrm{ev}_Y^{-1} \circ \alpha^* \circ \mathrm{ev}_X, \tag{C.37}$$

where $\alpha^* : \Sigma(C(X)) \to \Sigma(C(Y))$ is defined by $\alpha^*(\omega) = \omega \circ \alpha$; this lies in $\Sigma(C(Y))$,
because $\alpha$ is linear and $\alpha(fg) = \alpha(f)\alpha(g)$. Eq. (C.36) then holds by construction.

• In the compact case, we still need to prove that $\varphi$ is continuous. To do so, note
that a compact Hausdorff space $Y$ is completely regular, and as such a subbase for
its topology is given by sets of the form $U = f^{-1}(U')$, where $f \in C(Y)$ and $U' \in
\mathcal{O}(\mathbb{C})$. Hence $\varphi^{-1}(U) = (\varphi^* f)^{-1}(U')$, and since we know that $\varphi^* f = \alpha(f) \in
C(X)$, we conclude that $\varphi^{-1}(U)$ is open in $X$. Thus $\varphi$ is continuous.

• Similarly, in the manifold case, a map $\varphi : X \to Y$ is smooth iff $\varphi^* f \in C^\infty(X)$ for
each $f \in C^\infty(Y)$; using localization by bump functions à la the first part of the
proof of Proposition C.21, it is enough to prove this for open sets $X \subset \mathbb{R}^m$ and
$Y \subset \mathbb{R}^n$, so that $\varphi(y) = (\varphi^1(y), \ldots, \varphi^n(y))$. Knowing that $\varphi^* f$ is smooth for each
$f \in C^\infty(Y)$, we simply take $f(x^1, \ldots, x^n) = x^k$ to be the $k$'th coordinate function.
This declares each $\varphi^k$ to be smooth, and therewith also $\varphi$ itself.


We now state Gelfand duality, explaining its categorical interpretation in §E.1.


Theorem C.23. 1. If $X$ is a compact Hausdorff space, then $C(X)$ is a unital commutative
C*-algebra. A continuous map $\varphi : X \to Y$ induces a unital homomorphism
$C(\varphi) = \varphi^* : C(Y) \to C(X)$, which behaves well under composition, in that:

• If $\varphi$ is the identity, then so is $C(\varphi)$.
• If $\psi : Y \to Z$ is another continuous map, then $C(\psi \circ \varphi) = C(\varphi) \circ C(\psi)$.

2. If $A$ is a unital commutative C*-algebra, then $\Sigma(A)$ is a compact Hausdorff space.
A unital homomorphism $\alpha : A \to B$ induces a continuous function $\Sigma(\alpha) = \alpha^* :
\Sigma(B) \to \Sigma(A)$, which behaves well under composition in a similar way:

• If $\alpha$ is the identity, then so is $\Sigma(\alpha)$.
• If $\beta : B \to C$ is another unital homomorphism, then $\Sigma(\beta \circ \alpha) = \Sigma(\alpha) \circ \Sigma(\beta)$.

3. There are canonical homeomorphisms and isomorphisms:

$$\mathrm{ev}_X : X \to \Sigma(C(X)); \tag{C.38}$$
$$G_A : A \to C(\Sigma(A)), \tag{C.39}$$

with the following "naturality" properties:

• If $\Sigma \circ C(\varphi) : \Sigma(C(X)) \to \Sigma(C(Y))$ is the map induced by $\varphi : X \to Y$, then

$$\Sigma \circ C(\varphi) \circ \mathrm{ev}_X = \mathrm{ev}_Y \circ \varphi; \tag{C.40}$$

• If $C \circ \Sigma(\alpha) : C(\Sigma(A)) \to C(\Sigma(B))$ is the map induced by $\alpha : A \to B$, then

$$C \circ \Sigma(\alpha) \circ G_A = G_B \circ \alpha. \tag{C.41}$$


Proof. The proof is an assembly of previous results and routine verifications. 
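
The contravariance in part 1 of Theorem C.23 can be checked by hand on finite sets, which are compact Hausdorff in the discrete topology. The following is only a sketch under assumptions of our own: the spaces X, Y, Z, the maps phi and psi, and the helper names compose and C are hypothetical and chosen purely for illustration.

```python
# Finite discrete spaces (hence compact Hausdorff); every name here is illustrative.
X = [0, 1, 2]
Y = ["p", "q"]
Z = [10, 20]

phi = {0: "p", 1: "q", 2: "q"}.get      # phi : X -> Y
psi = {"p": 20, "q": 10}.get            # psi : Y -> Z


def compose(g, f):
    """Return the composite g after f."""
    return lambda x: g(f(x))


def C(m):
    """Pullback C(m) = m^*: send a function h on the target of m to h o m."""
    return lambda h: compose(h, m)


h = {10: 1 + 2j, 20: -3j}.get           # an element of C(Z)
k = {10: 0.5, 20: 4.0}.get              # another element of C(Z)

# Contravariance: C(psi o phi) = C(phi) o C(psi), checked pointwise on X.
lhs = C(compose(psi, phi))(h)
rhs = C(phi)(C(psi)(h))
contravariant = all(lhs(x) == rhs(x) for x in X)

# C(psi) is multiplicative: (h k) o psi = (h o psi)(k o psi), checked on Y.
hk = lambda z: h(z) * k(z)
multiplicative = all(C(psi)(hk)(y) == C(psi)(h)(y) * C(psi)(k)(y) for y in Y)

# C(phi) is unital: the constant function 1 on Y pulls back to 1 on X.
unital = all(C(phi)(lambda y: 1)(x) == 1 for x in X)
```

The order reversal is visible in the code itself: the pullback along psi is applied first and the pullback along phi second, although phi is the first map in the composite psi o phi.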


In the language of category theory, Theorem C.23 states that the categories CH of
compact Hausdorff spaces (with continuous functions as arrows) and $\mathrm{CCA}_1$ of commutative
unital C*-algebras (with unital homomorphisms as arrows, cf. Definition
C.2) are dual (i.e., contravariantly equivalent). In particular, we have an adjunction
between the functors $C : \mathrm{CH} \to \mathrm{CCA}_1$ and $\Sigma : \mathrm{CCA}_1 \to \mathrm{CH}$.

C.4 Gelfand isomorphism and spectral theory


As an example of Gelfand’s theory, Theorem 4.3 may be reformulated as follows: 


Theorem C.24. Let $H$ be a Hilbert space, and let $a = a^* \in B(H)_{\mathrm{sa}}$, with associated
(commutative) C*-algebra $C^*(a)$ generated by $a$ and $1_H$. The Gelfand spectrum
$\Sigma(C^*(a))$ of $C^*(a)$ is homeomorphic to $\sigma(a)$, under the mutually inverse maps

$$\Sigma(C^*(a)) \to \sigma(a), \quad \omega \mapsto \omega(a); \tag{C.42}$$
$$\sigma(a) \to \Sigma(C^*(a)), \quad \lambda \mapsto \omega_\lambda : f(a) \mapsto f(\lambda). \tag{C.43}$$

In particular, the image of the map $\omega \mapsto \omega(a)$ from $\Sigma(C^*(a))$ to $\mathbb{C}$ is $\sigma(a)$, and the
isomorphism $C^*(a) \to C(\sigma(a))$, $f(a) \mapsto f$, of Theorem B.94 is obtained by composing
the Gelfand transform $f(a) \mapsto \widehat{f(a)}$ from $C^*(a)$ to $C(\Sigma(C^*(a)))$ with the
isomorphism $C(\Sigma(C^*(a))) \to C(\sigma(a))$ obtained by pulling back the map (C.43).


Proof. First, we note that the map (C.43) is well defined. Indeed, it follows from
(B.289) that the map $\omega_\lambda : C^*(a) \to \mathbb{C}$ is linear for any $\lambda \in \sigma(a)$, whilst the following
computation, which uses (B.290), implies that $\omega_\lambda$ is multiplicative:

$$\omega_\lambda(f(a)g(a)) = \omega_\lambda((fg)(a)) = (fg)(\lambda) = f(\lambda)g(\lambda) = \omega_\lambda(f(a))\,\omega_\lambda(g(a)). \tag{C.44}$$

Injectivity of the map $\lambda \mapsto \omega_\lambda$ holds because $\sigma(a)$ is Hausdorff, so that $f(\lambda') =
f(\lambda)$ for each $f \in C(\sigma(a))$ implies $\lambda' = \lambda$. Surjectivity follows from (B.253), since

$$\sigma_{C^*(a)}(f(a)) = \sigma_{C(\sigma(a))}(f) = \mathrm{ran}(f), \tag{C.45}$$

where we used invariance of the spectrum under isomorphisms. Consider the function
$f(x) = x$, so that $f(a) = a$. It follows from (C.43) that $\omega_\lambda(a) = \lambda$. Conversely,
using the same function $f$, for given $\omega \in \Sigma(C^*(a))$ we find $\omega_{\omega(a)} = \omega$, so that the
maps in (C.42) - (C.43) are mutually inverse. It is clear from (C.42) - (C.43) that
$\omega_{\lambda_i} \to \omega_\lambda$ in the Gelfand topology on $\Sigma(C^*(a))$ (which is the topology of pointwise
convergence) iff $f(\lambda_i) \to f(\lambda)$ for each $f \in C(\sigma(a))$, which is the case iff $\lambda_i \to \lambda$
in $\sigma(a)$. Hence both of our maps $\Sigma(C^*(a)) \leftrightarrow \sigma(a)$ are continuous.

The final claim is a definition chase, using the computation

$$\widehat{f(a)}(\omega_\lambda) = \omega_\lambda(f(a)) = f(\lambda).$$


If $\dim(H) < \infty$, one may replace this proof by using the fact that $\sigma(a)$ consists of
the eigenvalues of $a$. If $p$ is a polynomial, then $\omega \in \Sigma(C^*(a))$ must satisfy $\omega(p(a)) =
p(\omega(a))$. The characteristic polynomial $p_c$ of $a$, i.e., $p_c(x) = \prod_{i=1}^{n} (\lambda_i - x)$, where the
$\lambda_i$ are the $n = \dim(H)$ eigenvalues of $a$ (including repetitions), satisfies $p_c(a) = 0$,
so that $\omega(p_c(a)) = 0$, i.e., $\prod_{i=1}^{n} (\lambda_i - \omega(a)) = 0$, and hence $\omega(a) = \lambda_i$ for some $i$,
or $\omega(a) \in \sigma(a)$. Thus (C.42) is well defined. In the opposite direction, eqs. (A.53) -
(A.55) show that (C.43) is also well defined, in that indeed $\omega_\lambda \in \Sigma(C^*(a))$.
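
The finite-dimensional argument can be traced numerically in the smallest nontrivial case. The sketch below is not part of the text's proof: the 2x2 real symmetric matrix a is a hypothetical example, and the helper names (mat_mul, eigenvalues, lam_minus, omega) are ours. It checks that the characteristic polynomial annihilates a, and that the functionals sending p(a) to p(lam), for lam an eigenvalue, are multiplicative on the algebra spanned by 1 and a.

```python
import math

a = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical example: real symmetric, hence self-adjoint


def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eigenvalues(m):
    """Eigenvalues of a real symmetric 2x2 matrix via the quadratic formula."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    return [(tr - disc) / 2.0, (tr + disc) / 2.0]


def lam_minus(m, lam):
    """Return lam * identity - m."""
    return [[(lam if i == j else 0.0) - m[i][j] for j in range(2)] for i in range(2)]


spectrum = eigenvalues(a)

# Characteristic polynomial evaluated at a: p_c(a) = (lam_1 - a)(lam_2 - a).
pc_of_a = mat_mul(lam_minus(a, spectrum[0]), lam_minus(a, spectrum[1]))


def omega(lam, m):
    """Character omega_lam on C*(a) = span{1, a}.

    Writes m = c0 * 1 + c1 * a and returns c0 + c1 * lam. Only meaningful for
    m in C*(a), and it relies on the off-diagonal entry a[0][1] being nonzero.
    """
    c1 = m[0][1] / a[0][1]
    c0 = m[0][0] - c1 * a[0][0]
    return c0 + c1 * lam


a2 = mat_mul(a, a)    # a^2
a3 = mat_mul(a, a2)   # a^3 = a * a^2, an element of C*(a)
```

For each eigenvalue lam, omega(lam, a3) can then be compared with omega(lam, a) * omega(lam, a2), which is the multiplicativity required of a point of the Gelfand spectrum.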

The construction of $C^*(a)$ as a C*-algebra within $B(H)$ may trivially be generalized
to arbitrary unital C*-algebras $A$, i.e., if $a \in A$, we define $C^*(a)$ as the C*-algebra
generated (within $A$) by $a$ and the unit $1_A$. If $a = a^*$, then $C^*(a)$ still equals
the norm-closure of the algebra of all polynomials in $a$, and hence $C^*(a)$ is once
again commutative. Defining the spectrum $\sigma(a)$ as in Definition B.81, we then have
the following generalization of Theorem C.24:


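Theorem C.25. Let $A$ be a unital C*-algebra and let $a^* = a \in A$. Then

$$\Sigma(C^*(a)) \cong \sigma(a), \quad \omega \leftrightarrow \omega(a); \tag{C.46}$$
$$C^*(a) \cong C(\sigma(a)), \quad f(a) \leftrightarrow f, \tag{C.47}$$

as spaces and as (commutative) C*-algebras, respectively. Under the Gelfand isomorphism
(C.47), the Gelfand transform $\hat{a}$ of $a \in C^*(a)$ is the identity $\mathrm{id}_{\sigma(a)} : \lambda \mapsto \lambda$,
whereas the Gelfand transform $\hat{1}_A$ of $1_A \in C^*(a)$ is the unit $1_{\sigma(a)} : \lambda \mapsto 1$.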


This continuous functional calculus may be proved in exactly the same way as
Theorems B.94 and C.24, with $B(H) \rightsquigarrow A$. However, these proofs did not invoke
Gelfand's Theorem (but rather derived it in the special case at hand), so it may give
additional insight into the situation if we reprove Theorem C.25 from Theorem C.8.


Proof. We now assume the isomorphism $C^*(a) \cong C(\Sigma(C^*(a)))$ via the Gelfand
transform. According to (C.22) and (B.253), which imply $\sigma(a) = \mathrm{ran}(\hat{a})$, the function
$\hat{a} : \Sigma(C^*(a)) \to \mathbb{C}$ is surjective onto the spectrum $\sigma(a) \subset \mathbb{C}$. We now prove
injectivity. If $\omega_1, \omega_2 \in \Sigma(C^*(a))$ and $\omega_1(a) = \omega_2(a)$, then, for all $n \in \mathbb{N}$, we have

$$\omega_1(a^n) = \omega_1(a)^n = \omega_2(a)^n = \omega_2(a^n). \tag{C.48}$$

Since also $\omega_1(1_A) = \omega_2(1_A) = 1$, we conclude by linearity that $\omega_1 = \omega_2$ on all
polynomials in $a$. By continuity (cf. Lemma C.9) this implies that $\omega_1 = \omega_2$, since
by definition the linear span of all polynomials is dense in $C^*(a)$. Using (C.12), we
have therefore proved that $\hat{a}(\omega_1) = \hat{a}(\omega_2)$ implies $\omega_1 = \omega_2$, i.e., $\hat{a}$ is injective.

Since $\hat{a} \in C(\Sigma(C^*(a)))$ by Theorem C.8, $\hat{a}$ is continuous. To prove continuity of
the inverse, recall that $\hat{a} : \Sigma(C^*(a)) \to \sigma(a)$ is the map $\omega \mapsto \omega(a)$, so that for $\lambda \in
\sigma(a)$, the functional $\hat{a}^{-1}(\lambda) \in \Sigma(C^*(a))$ maps $a$ to $\lambda$. By multiplicativity, $\hat{a}^{-1}(\lambda)$
then maps $a^n$ to $\lambda^n$. Hence by linearity and (C.14), for polynomials $p$ in $a$ one has

$$\hat{a}^{-1}(\lambda) : p(a) \mapsto p(\lambda). \tag{C.49}$$

Since polynomials are continuous, if $\lambda_n \to \lambda$ in $\sigma(a)$, then $p(\lambda_n) \to p(\lambda)$, so

$$(\hat{a}^{-1}(\lambda_n))(p(a)) \to (\hat{a}^{-1}(\lambda))(p(a)). \tag{C.50}$$

Since such polynomials $p(a)$ are dense in $C^*(a)$ by definition, and functionals in
$\Sigma(C^*(a))$, being continuous, are therefore determined by their values on polynomials,
we conclude that $\hat{a}^{-1}(\lambda_n) \to \hat{a}^{-1}(\lambda)$ pointwise. Since the Gelfand topology is
the topology of pointwise convergence, we conclude that $\hat{a}^{-1}$ is continuous, so that
$\hat{a}$ is a homeomorphism. This proves (C.46).

Finally, for compact Hausdorff spaces $X$ and $Y$, a homeomorphism $\varphi : X \to Y$
induces an isomorphism $\varphi^* : C(Y) \to C(X)$ of C*-algebras, where $\varphi^*(f) = f \circ \varphi$ (cf.
§C.3). Theorem C.8 and (C.46) give (C.47). Unfolding the latter isomorphism gives

$$C^*(a) \xrightarrow[\cong]{\;GT\;} C(\Sigma(C^*(a))) \xrightarrow[\cong]{\;(\hat{a}^{-1})^*\;} C(\sigma(a)), \tag{C.51}$$

where $GT$ is the Gelfand transform and $(\hat{a}^{-1})^*$ is the pullback of the homeomorphism
$\hat{a}^{-1} : \sigma(a) \to \Sigma(C^*(a))$, as in $\varphi^*$ above. Following these arrows and using
(C.49), one obtains the last claim.


Corollary C.26. Let $A$ be a unital C*-algebra and let $a^* = a \in A$, with spectrum
$\sigma(a)$. For each $f \in C(\sigma(a))$, there is an element
$f(a) \in A$, which is the obvious expression when $f$ is a polynomial (and in general is
given via the uniform approximation of $f$ by polynomials), such that

$$\|f(a)\| = \|f\|_\infty; \tag{C.52}$$
$$\sigma(f(a)) = f(\sigma(a)). \tag{C.53}$$

Eq. (C.53) is called the spectral mapping property. Furthermore, the norm and
spectrum of $a$ as an element of $A$ coincide with the norm and spectrum of $a$ in $C^*(a)$.


Proof. We write (C.51) in the opposite direction, i.e.,

$$C(\sigma(a)) \xrightarrow{\;\hat{a}^*\;} C(\Sigma(C^*(a))) \xrightarrow{\;GT^{-1}\;} C^*(a). \tag{C.54}$$

Indeed, if $\tilde{f} \in C(\Sigma(C^*(a)))$ is the image of $f \in C(\sigma(a))$ under the first arrow, then
$\tilde{f}(\omega) = f(\omega(a))$, and the second arrow says that $\widehat{f(a)} = \tilde{f}$. Together these give
$f(\omega(a)) = \omega(f(a))$, which by multiplicativity, linearity, and (C.14), is the case for
polynomials $f = p$; the general case follows from the polynomial case by continuity.

Eq. (C.52) follows from (C.18) and the fact that also the first arrow in (C.54) is
an isometry, and (C.53) follows from (C.22), with $a \rightsquigarrow f(a)$.

To close, take $f = \mathrm{id}_{\sigma(a)}$; then (C.52) gives $\|a\|_A = r(a)$, cf. (B.257), whilst
(C.18) gives $\|a\|_{C^*(a)} = r(a)$, too. Finally, (C.47) and (B.253) show that the spectrum
of $a$ in $C^*(a)$ is $\sigma(a)$, which by definition is its spectrum in $A$.


Corollary C.27. If $a^* = a$, then $\sigma(a) \subset \mathbb{R}$.


By Corollary C.26, we may take the spectrum of $a$ in $C^*(a)$. By Lemma C.11, the
Gelfand transform $\hat{a}$ is real-valued. Then use the last part of Theorem C.25.


Corollary C.28. The norm in a C*-algebra is unique (given all other structure).

Using (B.257) for $a = a^*$, and then (C.2), for arbitrary $a \in A$ we find

$$\|a\| = \sqrt{r(a^*a)}. \tag{C.55}$$

Since the spectrum (and hence the spectral radius $r$) is determined by the algebraic
structure, (C.55) shows that the norm is determined by the algebraic structure.
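
Formula (C.55) can be compared with a direct estimate of the operator norm in the C*-algebra of 2x2 matrices. This is only an illustrative sketch: the matrix a is a hypothetical non-normal example, the names are ours, and the operator norm is approximated by sampling real unit vectors (which suffices for a real matrix), so the comparison holds up to the sampling error.

```python
import math

a = [[1.0, 2.0], [0.0, 1.0]]   # hypothetical non-normal real matrix

# a^* a; for a real matrix the adjoint a^* is the transpose.
ata = [[sum(a[k][i] * a[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

# Spectral radius of the symmetric matrix a^* a, i.e. its largest eigenvalue.
tr = ata[0][0] + ata[1][1]
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
r = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
norm_from_spectrum = math.sqrt(r)      # right-hand side of (C.55)


def image_length(theta):
    """Length of a applied to the unit vector (cos theta, sin theta)."""
    u = (math.cos(theta), math.sin(theta))
    v = (a[0][0] * u[0] + a[0][1] * u[1], a[1][0] * u[0] + a[1][1] * u[1])
    return math.hypot(v[0], v[1])


# Operator norm estimated as a maximum over sampled unit vectors.
steps = 20000
norm_sampled = max(image_length(math.pi * k / steps) for k in range(steps))
```

The sampled value can only underestimate the true supremum, so one expects norm_sampled to lie just below norm_from_spectrum.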

C.5 C*-algebras without unit: general theory


In classical physics, non-compact phase spaces are described by commutative C*-algebras
without unit. Proper ideals in C*-algebras necessarily lack a unit, too. To
set the stage, we first assume that $A$ is a Banach algebra, and form the vector space

$$\dot{A} = A \oplus \mathbb{C}, \tag{C.56}$$

and turn this into an algebra in the obvious way, i.e., by means of

$$(a + \lambda \cdot 1_{\dot A})(b + \mu \cdot 1_{\dot A}) = ab + \lambda b + \mu a + \lambda\mu \cdot 1_{\dot A}, \tag{C.57}$$

where we have written $a + \lambda \cdot 1_{\dot A}$ for $(a, \lambda)$, etc. This turns the number 1 in $\mathbb{C}$ into a
unit $1_{\dot A}$ for $\dot{A}$, and this is the point: $\dot{A}$ is unital, even if $A$ lacks a unit. Defining

$$\|a + \lambda \cdot 1_{\dot A}\| = \|a\| + |\lambda|, \tag{C.58}$$

we also have a norm on $\dot{A}$, with $\|1_{\dot A}\| = 1$. Using (C.1), (C.57), and (C.58), we have

$$\|(a + \lambda \cdot 1_{\dot A})(b + \mu \cdot 1_{\dot A})\| \leq \|a\|\,\|b\| + |\lambda|\,\|b\| + |\mu|\,\|a\| + |\lambda|\,|\mu|
= \|a + \lambda \cdot 1_{\dot A}\|\,\|b + \mu \cdot 1_{\dot A}\|,$$

so that $\dot{A}$ is a Banach algebra with unit. Since by (C.58) the norm of $a \in A$ in $A$
coincides with the norm of $a + 0 \cdot 1_{\dot A}$ in $A \oplus \mathbb{C}$, we have shown the following:


Proposition C.29. For every Banach algebra $A$ (with or without unit) there exists a
unital Banach algebra $\dot{A}$, called the unitization of $A$, and an isometric (hence injective)
morphism $A \to \dot{A}$, such that $\dot{A}/A \cong \mathbb{C}$.


If $A$ is a C*-algebra, (C.58) fails to be a C*-norm with respect to the involution

$$(a + \lambda \cdot 1_{\dot A})^* = a^* + \bar\lambda \cdot 1_{\dot A}, \tag{C.59}$$

since (C.2) is not satisfied. Instead, the correct norm in which $A \oplus \mathbb{C}$ is a unital
C*-algebra is the one borrowed from $B(A)$, i.e., the Banach space of bounded linear
maps from $A$ to $A$ (regarded as a Banach space), relying on an embedding $A \subset B(A)$:


Proposition C.30. Let $A$ be a C*-algebra (with or without unit).

1. The map $L : A \to B(A)$, $a \mapsto L_a$, given by

$$L_a(b) = ab \tag{C.60}$$

establishes an isometric isomorphism between $A$ and $L(A) \subset B(A)$.

2. When $A$ has no unit, define a norm on $\dot{A} = A \oplus \mathbb{C}$ by

$$\|a + \lambda \cdot 1_{\dot A}\| = \|L_a + \lambda \cdot 1_{B(A)}\|, \tag{C.61}$$

where the right-hand side uses the operator norm in $B(A)$. With the operations
(C.57) and (C.59), the norm (C.61) turns $\dot{A}$ into a C*-algebra with unit.

Proof. By (C.1) we have $\|L_a b\| = \|ab\| \leq \|a\|\,\|b\|$ for all $b$, so that $\|L_a\| \leq \|a\|$.
On the other hand, using (C.2) and (A.22), assuming $a \neq 0$, we can write

$$\|a\| = \|aa^*\|/\|a\| = \left\|L_a\!\left(\frac{a^*}{\|a\|}\right)\right\| \leq \|L_a\|. \tag{C.62}$$

Hence

$$\|L_a\| = \|a\|. \tag{C.63}$$

Being isometric, the map $L$ must be injective; it is clearly a homomorphism, so that
we have proved the first claim of the proposition.

It is clear from (C.57) and (C.59) that the map $a + \lambda \cdot 1_{\dot A} \mapsto L_a + \lambda \cdot 1_{B(A)}$ is a
homomorphism. Hence the norm (C.61) satisfies (C.1), for this is satisfied in the
Banach algebra $B(A)$. In order to prove that the norm (C.61) satisfies (C.2), we
note that if an involution on a Banach algebra $A$ satisfies $\|a\|^2 \leq \|a^*a\|$, then $A$ is
a C*-algebra, because substituting $a \rightsquigarrow a^*$ gives $\|a^*\|^2 \leq \|aa^*\| \leq \|a\|\,\|a^*\|$, i.e.,
$\|a^*\| \leq \|a\|$, so that $\|a^*a\| \leq \|a\|^2$ and hence $\|a\|^2 = \|a^*a\|$.

Thus it suffices to show that for each $a \in A$ and $\lambda \in \mathbb{C}$ we have

$$\|L_a + \lambda \cdot 1_{B(A)}\|^2 \leq \|(L_a + \lambda \cdot 1_{B(A)})^*(L_a + \lambda \cdot 1_{B(A)})\|. \tag{C.64}$$

To prove (C.64), we note that by definition of the norm in $B(A)$, for given $T \in
B(A)$ and $\varepsilon > 0$, there exists a $b \in A$, with $\|b\| = 1$, such that $\|T\|^2 - \varepsilon \leq \|T(b)\|^2$.
Applying this with $T = L_a + \lambda \cdot 1_{B(A)}$, we infer that for every $\varepsilon > 0$ one has

$$\|L_a + \lambda \cdot 1_{B(A)}\|^2 - \varepsilon \leq \|(L_a + \lambda \cdot 1_{B(A)})b\|^2 = \|ab + \lambda b\|^2 = \|(ab + \lambda b)^*(ab + \lambda b)\|.$$

Here we used (C.2) in $A$. Using (C.60), the right-hand side may be rearranged as

$$\|L_{b^*}(L_a + \lambda \cdot 1_{B(A)})^*(L_a + \lambda \cdot 1_{B(A)})b\| \leq \|L_{b^*}\|\,\|(L_a + \lambda \cdot 1_{B(A)})^*(L_a + \lambda \cdot 1_{B(A)})\|\,\|b\|. \tag{C.65}$$

Since $\|L_{b^*}\| = \|b^*\| = \|b\| = 1$ by (C.63) and (A.22), and $\|b\| = 1$ also in the last
term, the inequality (C.64) follows by letting $\varepsilon \to 0$.


Hence the C*-algebraic version of Proposition C.29, slightly supplemented, is:


Theorem C.31. For every C*-algebra $A$, there is a unique unital C*-algebra $\dot{A}$ and
an isometric (hence injective) morphism $A \to \dot{A}$, such that $\dot{A}/A \cong \mathbb{C}$. Moreover, any
homomorphism $\alpha : A \to B$ extends to a unital homomorphism $\dot\alpha : \dot{A} \to \dot{B}$ by

$$\dot\alpha(a + \lambda \cdot 1_{\dot A}) = \alpha(a) + \lambda \cdot 1_{\dot B}. \tag{C.66}$$


Proof. Uniqueness of $\dot{A}$ follows from Corollary C.28; the rest is obvious.

This is very important, if only for the following reason:


Definition C.32. Let $A$ be a C*-algebra without unit. Then the spectrum $\sigma(a)$ of
any $a \in A$ consists of all $\lambda \in \mathbb{C}$ for which $a - \lambda \cdot 1_{\dot A}$ is not invertible in $\dot{A}$.


Proposition C.33. If $A$ has no unit, then $0 \in \sigma(a)$ for any $a \in A$.

Proof. If $0 \notin \sigma(a)$, i.e., if $a$ were invertible in $\dot{A}$, then $a^{-1} = b + \mu \cdot 1_{\dot A}$ for some
$b \in A$ and $\mu \in \mathbb{C}$. Then $1_{\dot A} = aa^{-1} = ab + \mu a \in A$. This is a contradiction.


The spectral theory of compact operators provides a nice illustration of this proposition:
see Theorem B.136.4. At the commutative end of the operator-algebraic world,
we have the obvious fact that if $X$ is not compact, no $f \in C_0(X)$ is invertible.

The construction of $\dot{A}$ through (C.56), (C.57), (C.59), and (C.61) also works verbatim
if $A$ already has a unit $1_A$, in which case the spectrum $\sigma(a)$ of $a \in A$ may be
compared with the spectrum $\sigma(\dot{a})$ of its image $\dot{a} = (a, 0)$ in $\dot{A}$.


Lemma C.34. Let $A$ be a C*-algebra with unit, embedded in $\dot{A}$. For any $a \in A$, the
spectrum $\sigma(a)$ in $A$ is related to the spectrum $\sigma(\dot{a})$ of its image $\dot{a} = (a, 0)$ in $\dot{A}$ by

$$\sigma(\dot{a}) = \sigma(a) \cup \{0\}. \tag{C.67}$$

This will be important for the proof of the fundamental Theorem C.62 below.


Proof. Suppose $0 \neq z \in \rho(a)$, so that $b = (a - z \cdot 1_A)^{-1}$ exists and satisfies

$$ab - zb = ba - zb = 1_A. \tag{C.68}$$

Then $b' = b + z^{-1} \cdot (1_A - 1_{\dot A})$ satisfies $ab' - zb' = b'a - zb' = 1_{\dot A}$, so that $b' =
(\dot{a} - z \cdot 1_{\dot A})^{-1}$ exists in $\dot{A}$, and hence $z \in \rho(\dot{a})$. Conversely, if $0 \neq z \in \rho(\dot{a})$ with
corresponding $b'$ as before, then we first form $b = b' - z^{-1} \cdot (1_A - 1_{\dot A})$, which satisfies
(C.68) but may not lie in $A$. If $b = b'' + \beta \cdot 1_{\dot A}$, where $b'' \in A$ and $\beta \in \mathbb{C}$, this
is remedied by redefining $b''' = b + \beta \cdot (1_A - 1_{\dot A})$, which lies in $A$ and is inverse to
$a - z \cdot 1_A$. Furthermore, by the proof of Proposition C.33 with $a \rightsquigarrow \dot{a}$, we always
have $0 \in \sigma(\dot{a})$. If $0 \in \sigma(a)$, then the above argument gives $\sigma(\dot{a}) = \sigma(a)$, which is a
special case of (C.67). If $0 \notin \sigma(a)$, then (C.67) follows as it stands.


To close this section, we introduce the technique of approximate units, which will
play a decisive role in the theory of ideals in C*-algebras (see §C.9). Let us first
give an example. For any noncompact space $X$, the C*-algebra $C_0(X)$ has no unit
(the unit would be $1_X$, which does not vanish at infinity because it is constant). There
is a certain substitute for the absentee unit, though. Take $X = \mathbb{R}$ for simplicity, and
pick a sequence of functions $1_n$, $n \in \mathbb{N}$, that take the value 1 on $[-n, n]$ and vanish for
$|x| \geq n + 1$. It is clear that one does not have $1_n \to 1_{\mathbb{R}}$ in the sup-norm, but instead
one has $\lim_{n \to \infty} \|1_n f - f\|_\infty = 0$ for all $f \in C_0(\mathbb{R})$. More generally, one puts:


Definition C.35. An approximate unit in a non-unital C*-algebra $A$ indexed by
some directed set $\Lambda$ is a family $\{1_\lambda\}_{\lambda \in \Lambda}$ of selfadjoint elements of $A$, such that

$$\|1_\lambda\| \leq 1, \tag{C.69}$$

and, for each $a \in A$,

$$\lim_{\lambda \to \infty} \|1_\lambda a - a\| = \lim_{\lambda \to \infty} \|a 1_\lambda - a\| = 0. \tag{C.70}$$

Here the limit is meant in the sense of convergence of the nets $\lambda \mapsto \|1_\lambda a - a\|$ and
$\lambda \mapsto \|a 1_\lambda - a\|$ in $\mathbb{R}$ indexed by $\Lambda$ (i.e., for each open neighbourhood $U$ of 0 in $\mathbb{R}$
there is some $\lambda_U \in \Lambda$ such that $\|1_\lambda a - a\| \in U$ for all $\lambda \geq \lambda_U$, etc.).
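
The example on the real line that motivates this definition can be probed numerically. The sketch below makes several choices that are ours rather than the text's: the cut-off functions are taken to be piecewise linear (so that they are continuous), the test function f(x) = 1/(1 + x^2) is one arbitrary element of C_0(R), and suprema are approximated by maxima over a finite grid on [-100, 100].

```python
def unit_n(n, x):
    """Continuous cut-off: 1 on [-n, n], 0 for |x| >= n + 1, linear in between."""
    return min(1.0, max(0.0, n + 1.0 - abs(x)))


def f(x):
    """A sample element of C_0(R) (hypothetical choice)."""
    return 1.0 / (1.0 + x * x)


grid = [k / 20.0 for k in range(-2000, 2001)]   # grid points in [-100, 100]


def sup_error(n):
    """Grid approximation of the sup-norm of 1_n f - f."""
    return max(abs(unit_n(n, x) * f(x) - f(x)) for x in grid)


def sup_distance_to_one(n):
    """Grid approximation of the sup-norm of 1_n - 1."""
    return max(abs(unit_n(n, x) - 1.0) for x in grid)


errors = [sup_error(n) for n in (1, 5, 25)]
distances = [sup_distance_to_one(n) for n in (1, 5, 25)]
```

The list errors shrinks as n grows, in line with (C.70), whereas distances stays at 1, reflecting the fact that the cut-offs do not converge to the constant function 1 in the sup-norm.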


Proposition C.36. Every non-unital C*-algebra $A$ has an approximate unit $\{1_\lambda\}_{\lambda \in \Lambda}$.
When $A$ is separable, one may choose the directed set $\Lambda$ countable (i.e. $\Lambda = \mathbb{N}$).


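Proof. One takes $\Lambda$ to be the set of all finite subsets of $A$ (or, if $A$ is separable, of
a countable dense subset of $A$), partially ordered by inclusion. Hence $\lambda \in \Lambda$ is of
the form $\lambda = \{a_1, \ldots, a_n\}$, from which we build the element $b_\lambda = \sum_i a_i a_i^*$. Clearly
$b_\lambda$ is selfadjoint, and according to Theorem C.52 and Proposition C.51 one has
$\sigma(b_\lambda) \subset \mathbb{R}^+$, so that $n^{-1} 1_{\dot A} + b_\lambda$ is invertible in the unitization $\dot{A}$ of $A$. Take

$$1_\lambda = b_\lambda (n^{-1} 1_{\dot A} + b_\lambda)^{-1}. \tag{C.71}$$

Since $b_\lambda = b_\lambda^*$ and $b_\lambda$ commutes with functions of itself like $(n^{-1} 1_{\dot A} + b_\lambda)^{-1}$, one
has $1_\lambda^* = 1_\lambda$. Although $(n^{-1} 1_{\dot A} + b_\lambda)^{-1}$ is computed in $\dot{A}$, so that it is of the form
$c + \mu 1_{\dot A}$ (for some $c \in A$ and $\mu \in \mathbb{C}$), one has $1_\lambda = b_\lambda c + \mu b_\lambda$, which lies in $A$.
Using the continuous functional calculus (i.e. Theorem C.25) with $f(t) = t/(n^{-1} + t)$
on $b_\lambda$, one sees from (C.53) and the positivity of $b_\lambda$ that $\sigma(1_\lambda) \subset [0, 1]$. This implies
(C.69) because of (B.257). Putting $c_i = 1_\lambda a_i - a_i$, a simple computation shows that

$$\sum_i c_i c_i^* = n^{-2} b_\lambda (n^{-1} 1_{\dot A} + b_\lambda)^{-2}. \tag{C.72}$$

We now apply (C.52) with $a \rightsquigarrow b_\lambda$ and $f(t) = n^{-2} t (n^{-1} + t)^{-2}$. Since $f \geq 0$, and $f$
assumes its maximum at $t = 1/n$, one has $\sup_{t \in \mathbb{R}^+} |f(t)| = 1/4n$. As $\sigma(b_\lambda) \subset \mathbb{R}^+$,
it follows that $\|f\|_\infty \leq 1/4n$. Therefore, by (C.52) we have

$$\|n^{-2} b_\lambda (n^{-1} 1_{\dot A} + b_\lambda)^{-2}\| \leq 1/4n, \tag{C.73}$$

so that $\|\sum_i c_i c_i^*\| \leq 1/4n$ by (C.72). By Lemma C.37 below this implies that
$\|c_i c_i^*\| \leq 1/4n$ for each $i = 1, \ldots, n$. Since any $a \in A$ sits in some directed subset
of $\Lambda$ with $n \to \infty$, eq. (C.2) implies

$$\lim_{\lambda \to \infty} \|1_\lambda a - a\|^2 = \lim_{\lambda \to \infty} \|(1_\lambda a - a)^*(1_\lambda a - a)\| = \lim_{\lambda \to \infty} \|c_i^* c_i\| = 0. \tag{C.74}$$

The other equality in (C.70) follows analogously.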
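
The one analytic fact used in this proof, namely that f(t) = n^{-2} t (n^{-1} + t)^{-2} attains its maximum 1/4n on the positive half-line at t = 1/n, is easy to confirm numerically. The following is a rough check under our own assumptions: the supremum is approximated by a maximum over a finite grid on [0, 50], and the function and variable names are hypothetical.

```python
def f(n, t):
    """f(t) = n^{-2} t (n^{-1} + t)^{-2}, the function to which (C.52) is applied."""
    return t / (n * n * (1.0 / n + t) ** 2)


def grid_sup(n, t_max=50.0, steps=200000):
    """Maximum of f(n, .) over an evenly spaced grid on [0, t_max]."""
    return max(f(n, t_max * k / steps) for k in range(steps + 1))


# For each n: (grid maximum, value at t = 1/n, claimed bound 1/(4n)).
checks = {n: (grid_sup(n), f(n, 1.0 / n), 1.0 / (4 * n)) for n in (1, 2, 10)}
```

In each entry of checks the three numbers should agree up to the grid resolution, which is the bound entering (C.73).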


In this proof we used the following lemma.

Lemma C.37. If $a, b \in A^+$ and $\|a + b\| \leq k$, then $\|a\| \leq k$.


Proof. We first pass to the unitization $\dot{A}$ of $A$. By (C.83) we have $a + b \leq k 1_{\dot A}$,
hence $0 \leq a \leq k 1_{\dot A} - b$ by linearity of $\leq$ (see Proposition C.51 below), which also
implies that $k 1_{\dot A} - b \leq k 1_{\dot A}$, as $0 \leq b$. Hence, using $-k 1_{\dot A} \leq 0$ (since $k \geq 0$), we obtain
$-k 1_{\dot A} \leq a \leq k 1_{\dot A}$, from which $\|a\| \leq k$ by (C.84).

C.6 C*-algebras without unit: commutative case


We still owe the reader a proof of Theorems C.8 and C.23 for the nonunital case. 

In the commutative case, the unitization procedure has a simple topological 
meaning, which illustrates the general principle that the use of commutative C*- 
algebras often allows one to trade topological properties for algebraic ones. 

The one-point compactification $\dot{X}$ of a non-compact locally compact topological
space $X$ is the set $\dot{X} = X \cup \{\infty\}$, topologized by the open sets in $X$ plus those subsets of
$X \cup \{\infty\}$ whose complement is compact in $X$. The injection $i : X \to \dot{X}$ is continuous,
and any continuous function $f \in C_0(X)$ extends uniquely to a function $\dot{f} \in C(\dot{X})$
satisfying $\dot{f}(\infty) = 0$. The space $\dot{X}$ is the solution (unique up to homeomorphism)
of a universal problem: if $\varphi : X \to Y$ is a map between locally compact Hausdorff
spaces such that $Y \setminus \varphi(X)$ is a point and $\varphi$ is a homeomorphism onto its image, then
there is a unique homeomorphism $\psi : \dot{X} \to Y$ such that $\varphi = \psi \circ i$. All this is true
even when $X$ is compact, in which case $\infty$ is an isolated point of $\dot{X}$.

The unitization of $C_0(X)$ corresponds to the one-point compactification of $X$:


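Lemma C.38. Let $X$ be a locally compact Hausdorff space, and let $\dot{C}_0(X)$ denote the
unitization of $C_0(X)$. Then $\dot{C}_0(X) \cong C(\dot{X})$.


Proof. The map $\dot{C}_0(X) \to C(\dot{X})$ given by $f + \lambda \cdot 1 \mapsto \dot{f} + \lambda \cdot 1_{\dot X}$ is obviously
an injective homomorphism. To prove surjectivity, note that any $g \in C(\dot{X})$ assumes
the form $g = \dot{f} + g(\infty) \cdot 1_{\dot X}$, where $\dot{f} = g - g(\infty) \cdot 1_{\dot X}$ is such that $\dot{f}|_X \in C_0(X)$. Thus
our map is an algebraic isomorphism, which by Theorem C.62 is also isometric.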


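Lemma C.39. Let $A$ be a commutative C*-algebra, with unitization $\dot{A}$. Then the following
map $s_A : \dot\Sigma(A) \to \Sigma(\dot{A})$, from the one-point compactification $\dot\Sigma(A)$ of $\Sigma(A)$ to
the Gelfand spectrum of $\dot{A}$, is a homeomorphism:

1. Each $\omega \in \Sigma(A)$ extends to a character $\dot\omega = s_A(\omega)$ on $\dot{A}$ by

$$\dot\omega(a + \lambda 1_{\dot A}) = \omega(a) + \lambda. \tag{C.75}$$

2. The following functional $\omega_\infty = s_A(\infty)$ on $\dot{A}$ is a character of $\dot{A}$:

$$\omega_\infty(a + \lambda 1_{\dot A}) = \lambda. \tag{C.76}$$

3. There are no other characters on $\dot{A}$ (i.e. except $\omega_\infty$ and $\dot\omega$, where $\omega \in \Sigma(A)$).


Proof. Only the third part is nontrivial: any $\omega' \in \Sigma(\dot{A})$ restricts to $A$; if this
restriction is zero, then $\omega' = \omega_\infty$, and if not, we have $\omega' = \dot\omega$ with $\omega = \omega'|_A \in \Sigma(A)$.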


We are now in a position to prove Theorem C.8 also in the nonunital case. Applying
the unital case of Theorem C.8 to $\dot{A}$ and using Lemmas C.38 and C.39, one finds

$$A \oplus \mathbb{C} = \dot{A} \cong C(\Sigma(\dot{A})) \cong C(\dot\Sigma(A)) \cong \dot{C}_0(\Sigma(A)) = C_0(\Sigma(A)) \oplus \mathbb{C}. \tag{C.77}$$

Keeping track of all isomorphisms, the initial $\mathbb{C}$ is duly mapped to the final $\mathbb{C}$ (as
befits an isomorphism of unital C*-algebras), and $A$ is mapped to $C_0(\Sigma(A))$.

Next, we return to Theorem C.23. If $X$ fails to be compact, the difficulty arises
that a map $\varphi : X \to Y$ does not, in general, pull back to a morphism $\varphi^* : C_0(Y) \to
C_0(X)$. For example, with $Y$ equal to a point, any $f \in C(Y) = \mathbb{C}$ pulls back to a
constant function on $X$, which does not vanish at infinity. Hence some restriction is
necessary on the class of allowed maps between locally compact Hausdorff spaces.


Definition C.40. A map $\varphi : X \to Y$ between locally compact Hausdorff spaces is
proper when $\varphi^{-1}(K)$ is compact for any compact set $K \subset Y$.


Without proof (since this is basic topology), we list some properties of proper maps.

Lemma C.41. Let $\varphi : X \to Y$ be a map between locally compact Hausdorff spaces.


1. $\varphi$ is proper iff it is closed and $\varphi^{-1}(\mathrm{pt})$ is compact for any point $\mathrm{pt} \in Y$.

2. If $X$ is compact and $\varphi$ is continuous, then $\varphi$ is proper.

3. If $Y$ is compact and $X$ is not, proper maps $\varphi$ (trivially) do not exist.

4. If $\varphi$ is continuous, then $\varphi$ is proper iff $\dot{\varphi} : \dot{X} \to \dot{Y}$, given by $\dot{\varphi}(x) = \varphi(x)$, $x \in X$,
and $\dot{\varphi}(\infty_X) = \infty_Y$, is continuous (which is automatic if $X$ is compact, of course).

5. The composition of two proper maps is again proper.
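
To make Definition C.40 concrete, here is a small worked example (an illustration added here, not taken from the text) of one proper and one non-proper continuous map between subsets of $\mathbb{R}$ with the usual topology. The maps $\varphi_1$ and $\varphi_2$ are hypothetical names chosen for this sketch.

```latex
% Illustration only: \varphi_1 and \varphi_2 are example maps, not notation from the text.
\begin{align*}
\varphi_1 &: \mathbb{R} \to \mathbb{R},\quad \varphi_1(x) = x^2, &
  \varphi_1^{-1}([-r,r]) &= [-\sqrt{r},\sqrt{r}] \quad (r \geq 0) \text{ is compact};\\
\varphi_2 &: (0,1) \hookrightarrow \mathbb{R},\quad \varphi_2(x) = x, &
  \varphi_2^{-1}([0,1]) &= (0,1) \text{ is not compact}.
\end{align*}
```

Since every compact $K \subset \mathbb{R}$ lies in some $[-r,r]$ and $\varphi_1^{-1}(K)$ is closed in the compact set $[-\sqrt{r},\sqrt{r}]$, the map $\varphi_1$ is proper, whereas $\varphi_2$ is not. Accordingly, the pullback along $\varphi_2$ of $f(y) = 1/(1+y^2) \in C_0(\mathbb{R})$ tends to $1$ as $x \to 0$ and hence does not lie in $C_0((0,1))$.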


The algebraic (or “noncommutative”) counterpart of a proper map is as follows. 


Definition C.42. A homomorphism $\alpha : A \to B$ between C*-algebras is called nondegenerate
when $\overline{\alpha(A)B} = B$, in other words, if $\alpha(A)B$ (i.e., the linear span of all
expressions of the form $\alpha(a)b$, $a \in A$, $b \in B$) is dense in $B$.


For example, any unital homomorphism between unital C*-algebras is trivially
nondegenerate, and conversely, a nondegenerate homomorphism $\alpha : A \to B$ between
unital C*-algebras is automatically unital. To see this, note that by (C.4) - (C.5)
the element $e = \alpha(1_A)$ is a projection in $B$ (i.e., $e^2 = e^* = e$), so that $\alpha(A)B \subseteq eB$. Since
$B = eB \oplus (1_B - e)B$ as a vector space, $\alpha(A)B$ and hence $eB$ can only be dense in
$B$ when $e = 1_B$. Similarly, using an approximate unit in $B$ it is easy to show that
nondegenerate homomorphisms $A \to B$ cannot exist if $A$ is unital but $B$ is not.

This is a “noncommutative” version of the third part of Lemma C.41 above.


Lemma C.43. Let $\varphi : X \to Y$ be a continuous proper map between locally compact
Hausdorff spaces. If $f \in C_0(Y)$, then $f \circ \varphi \in C_0(X)$, and the corresponding pullback
$\varphi^* : C_0(Y) \to C_0(X)$ is a nondegenerate homomorphism of C*-algebras.


Proof. Let $f \in C_0(Y)$ and $\varepsilon > 0$, giving a compact $K \subset Y$ such that $|f(y)| < \varepsilon$ for
each $y \notin K$. Then $K' = \varphi^{-1}(K) \subset X$ is compact, and $|\varphi^* f(x)| < \varepsilon$ for each $x \notin K'$.
For nondegeneracy, take $g \in C_0(X)$ and $\varepsilon > 0$; these yield a compact set $L \subset X$
such that $|g(x)| < \varepsilon$ for each $x \notin L$. Then $\varphi(L) \subset Y$ is compact, so Urysohn gives us
$f \in C_c(Y)$ with $0 \leq f(y) \leq 1$ for each $y \in Y$ and $f(y) = 1$ for each $y \in \varphi(L)$. Then:


$\|(\varphi^* f)g - g\|_\infty = \sup_{x \in X} \{|f(\varphi(x))g(x) - g(x)|\} < 2\varepsilon.$


The (commutative) C*-algebraic counterpart of this lemma is as follows:

Lemma C.44. Let $\alpha : A \to B$ be a nondegenerate homomorphism between commutative
C*-algebras. If $\omega \in \Sigma(B)$, then $\omega \circ \alpha \in \Sigma(A)$, and the ensuing pullback
$\alpha^* : \Sigma(B) \to \Sigma(A)$ is a continuous proper map between the two Gelfand spectra.


Proof. Multiplicativity of $\omega \circ \alpha$ is clear, as $\alpha$ is a homomorphism. If $\omega \circ \alpha$ were
identically zero, then $\omega(\alpha(a)b) = \omega(\alpha(a))\omega(b) = 0$ for all $a \in A$ and $b \in B$, so that
$\omega$ would vanish on the dense subspace $\alpha(A)B$ and hence on all of $B$; this contradicts
$\omega \neq 0$, given the assumption that $\alpha$ be nondegenerate. Continuity of $\alpha^*$ follows from the fact
that the Gelfand topology is the topology of pointwise convergence. Finally, in the
present context, properness of $\alpha^*$ is most appropriately derived as follows:


1. Use (C.66) to pass to a unital homomorphism $\dot{\alpha} : \dot{A} \to \dot{B}$.

2. Theorem C.23.2 gives a continuous map $(\dot{\alpha})^* : \Sigma(\dot{B}) \to \Sigma(\dot{A})$.

3. Lemma C.46 below and continuity of $s_B$ and $s_A^{-1}$ make $(\alpha^*)^{\cdot}$ continuous.

4. Lemma C.41.4 then proves that $\alpha^*$ is proper (and continuous).


This suggests the following generalization of Theorem C.23: 


Theorem C.45. 1. If $X$ is a locally compact Hausdorff space, then $C_0(X)$ is a
commutative C*-algebra. A continuous proper map $\varphi : Y \to X$ induces a nondegenerate
homomorphism $C_0(\varphi) = \varphi^* : C_0(X) \to C_0(Y)$, which behaves well
under composition (exactly as in Theorem C.23).

2. If $A$ is a commutative C*-algebra, then $\Sigma(A)$ is a locally compact Hausdorff
space. A nondegenerate homomorphism $\alpha : A \to B$ induces a continuous proper
map $\Sigma(\alpha) = \alpha^* : \Sigma(B) \to \Sigma(A)$, which behaves well under composition, too.

3. There are canonical homeomorphisms and isomorphisms,
$\mathrm{ev}_X : X \xrightarrow{\;\cong\;} \Sigma(C_0(X));$ (C.78)
$G_A : A \xrightarrow{\;\cong\;} C_0(\Sigma(A)),$ (C.79)


with similar naturalness properties as the corresponding maps in Theorem C.23.


Categorically speaking, Theorem C.23 thus expanded states that the category LCHp
of locally compact Hausdorff spaces and proper continuous maps is dual to the
category CCAn of commutative C*-algebras and nondegenerate homomorphisms.


Proof. Parts 1 and 2 are Lemmas C.43 and C.44, respectively; correct composition
of the maps in question is easily checked (as simply as in the unital case).

Eq. (C.79) has already been proved, cf. (C.77). Similarly, using Proposition C.19
(with $X \rightsquigarrow \dot{X}$) and Lemma C.39 (with $A \rightsquigarrow C_0(X)$), we have


$X \cup \{\infty\} = \dot{X} \cong \Sigma(C(\dot{X})) \cong \Sigma(C_0(X)^{\cdot}) \cong \Sigma(C_0(X))^{\cdot} = \Sigma(C_0(X)) \cup \{\infty\}.$ (C.80)


Keeping track of the isomorphisms in question, it is easily verified that $X$ and $\infty$ are
mapped to $\Sigma(C_0(X))$ and $\omega_\infty$, respectively, and this proves (C.78).

Naturality follows from the unital case (Theorem C.23) and the following lemma:


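Lemma C.46. 1. Let $\alpha : A \to B$ be a nondegenerate homomorphism between commutative
C*-algebras. Then the following diagram commutes:


\[
\begin{array}{ccc}
\Sigma(B)^{\cdot} & \xrightarrow{\;(\alpha^*)^{\cdot}\;} & \Sigma(A)^{\cdot} \\
\downarrow {\scriptstyle s_B} & & \downarrow {\scriptstyle s_A} \\
\Sigma(\dot{B}) & \xrightarrow{\;(\dot{\alpha})^*\;} & \Sigma(\dot{A}),
\end{array}
\]


where $s_A$ and $s_B$ are defined in Lemma C.39, $\dot{\alpha}$ is defined in (C.66), and $(\alpha^*)^{\cdot} = \dot{\varphi}$
for $\varphi = \alpha^* : \Sigma(B) \to \Sigma(A)$, where the dot is defined as in Lemma C.41.4.

2. Let $\varphi : X \to Y$ be a proper continuous map between locally compact Hausdorff
spaces. Then the following diagram commutes:


\[
\begin{array}{ccc}
C_0(Y)^{\cdot} & \xrightarrow{\;(\varphi^*)^{\cdot}\;} & C_0(X)^{\cdot} \\
\downarrow {\scriptstyle c_Y} & & \downarrow {\scriptstyle c_X} \\
C(\dot{Y}) & \xrightarrow{\;(\dot{\varphi})^*\;} & C(\dot{X}),
\end{array}
\]


where $c_X$ and $c_Y$ are defined in the proof of Lemma C.38, $(\varphi^*)^{\cdot} = \dot{\alpha}$ for $\alpha = \varphi^* :
C_0(Y) \to C_0(X)$ defined by (C.66), and $\dot{\varphi} : \dot{X} \to \dot{Y}$ is defined in Lemma C.41.4.


The proof is a diagram chase, but let us note that in clause 1 the role of nondegeneracy
is to ensure that $\alpha^*$ (and hence $(\alpha^*)^{\cdot}$) is defined in the first place (cf. Lemma
C.44). Similarly, in clause 2, the properness assumption on $\varphi$ ensures that $\varphi^*$ (and
hence $(\varphi^*)^{\cdot}$) is defined. Once defined, commutativity of these diagrams is obvious.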

Finally, the property that LCHp is indeed a category is trivial (as the identity maps
$\mathrm{id} : X \to X$ are proper), but the corresponding fact for CCAn is not, for we need to
show that the identity arrows $\mathrm{id} : A \to A$ are nondegenerate. This comes down to the
property that $A^2 = A \cdot A$ is dense in $A$. In fact, the situation is even better:


Lemma C.47. In any C*-algebra $A$ one has $A^2 = A$ (and hence $A^n = A$, $n \in \mathbb{N}$).


Proof. We prove that any self-adjoint $a \in A$ takes the form
$a = a_1 a_2,$ (C.81)


for suitable $a_1, a_2 \in A$. Since the linear span of such $a$ is $A$, this proves the lemma.
We assume $A$ has no unit, for otherwise the claim is trivial. We then embed $A \subset \dot{A}$
and, for $a^* = a \in A$, consider $C^*(a) \subset \dot{A}$. We factor the identity function $t \mapsto t$ on
$\sigma(a) \subset \mathbb{R}$ as $t = f_1(t) f_2(t)$ for some $f_i \in C(\sigma(a))$, so that by Corollary C.26, we
have (C.81) for $a_i = f_i(a) \in C^*(a)$. By the properties of the map $f \mapsto f(a)$ mentioned
in Corollary C.26, including the fact that the unit function $1_{\sigma(a)}$ is mapped to $1_{\dot{A}}$,
it follows that if $f(a) = b + \mu \cdot 1_{\dot{A}}$ for some $b \in A$ and $\mu \in \mathbb{C}$, then $f(0) = \mu$; note that
$0 \in \sigma(a)$ by Proposition C.33. Consequently, imposing the additional condition
$f_i(0) = 0$ enforces $a_i \in A$.
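
One possible explicit choice of the factorization used in this proof is sketched below. The specific functions are an illustration added here (many other choices work), not necessarily the ones the author has in mind.

```latex
% Illustrative choice of f_1, f_2 on \sigma(a) \subset \mathbb{R}; both are continuous and vanish at 0.
\begin{align*}
f_1(t) &= \operatorname{sgn}(t)\,|t|^{1/2}, & f_2(t) &= |t|^{1/2},\\
f_1(t)\, f_2(t) &= \operatorname{sgn}(t)\,|t| = t, & f_1(0) &= f_2(0) = 0 .
\end{align*}
```

With this choice $a_1 = f_1(a)$ and $a_2 = f_2(a)$ both lie in $A$ and satisfy $a = a_1 a_2$, as required in (C.81).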


Corollary C.48. Each nondegenerate homomorphism $\alpha : C_0(Y) \to C_0(X)$ is induced
by a proper continuous map $\varphi : X \to Y$ via $\alpha = \varphi^*$.


Proof. Given (C.78), the proof is the same as for the compact case, cf. Corollary
C.22. In particular, $\varphi$ is given by (C.37), which map is proper because $\alpha^*$ is proper
by Lemma C.44 and $\mathrm{ev}_X$ and $\mathrm{ev}_Y$ are homeomorphisms.


C.7 Positivity in C*-algebras


We now turn to the important notion of positivity. First, we give two examples: 


• An operator $a \in B(H)$ on a Hilbert space $H$ is called positive when $\langle \psi, a\psi \rangle \geq 0$
for each $\psi \in H$. By Proposition B.99, this property is equivalent to $a^* = a$ and
$\sigma(a) \subset \mathbb{R}^+$, or to $a = b^*b$ for some $b \in B(H)$, or to $a = c^2$ for some $c = c^* \in B(H)$.

• A function $f$ on some space $X$ is called positive when $f(x) \geq 0$ for all $x \in X$. This
applies, in particular, to elements of the commutative C*-algebra $C_0(X)$.


These examples are not as dissimilar as they might appear at first sight: $a \in B(H)$
is positive iff its Gelfand transform $\mathrm{id}_{\sigma(a)} = \hat{a}$ is positive as a function in $C(\sigma(a))$;
cf. Theorem C.24. Hence we have a notion of positivity for certain concrete C*-algebras,
which we would like to generalize to arbitrary abstract C*-algebras.


Definition C.49. An element $a$ of a C*-algebra $A$ is called positive when $a = a^*$ and
its spectrum is positive; i.e., $\sigma(a) \subset \mathbb{R}^+$. We write $a \geq 0$ when $a$ is positive, and $A^+$
for the set of all positive elements in $A$.


The basic structure of A* is captured by the following definition. 


Definition C.50. A convex cone in a real vector space $V$ is a subset $V^+$ such
that:


1. If $v \in V^+$ and $t \in \mathbb{R}^+$, then $tv \in V^+$.

2. If $v, w \in V^+$, then $v + w \in V^+$.

3. $V^+ \cap -V^+ = \{0\}$.

A linear partial ordering in $V$ is a partial ordering $\leq$ in which $v \leq w$ implies
$tv \leq tw$ for all $t \in \mathbb{R}^+$, as well as $v + u \leq w + u$ for all $u \in V$.


These structures are equivalent: a convex cone $V^+ \subset V$ defines a linear partial ordering
$\leq$ by $v \leq w$ if $w - v \in V^+$, and conversely, $\leq$ yields $V^+ = \{v \in V \mid 0 \leq v\}$.

Proposition C.51. The set $A^+$ of all positive elements of a C*-algebra $A$ is a convex
cone in the real vector space $A_{\mathrm{sa}}$, see (C.6).


Proof. Let $a \in A^+$. Property 1 follows from $\sigma(ta) = t\sigma(a)$, which is a special case
of (B.270). Since $\sigma(a) \subseteq [0, r(a)]$, we have $|c - \lambda| \leq c$ for all $\lambda \in \sigma(a)$ and all
$c \geq r(a)$. Hence $\sup_{\lambda \in \sigma(a)} |c \cdot 1_{\sigma(a)}(\lambda) - \hat{a}(\lambda)| \leq c$ by (C.22) and Theorem C.24, i.e.,
$\|c \cdot 1_{\sigma(a)} - \hat{a}\|_\infty \leq c$. Gelfand transforming back to $C^*(a)$, by (C.18) this implies


$\|c \cdot 1_A - a\| \leq c,$ (C.82)


for all $c \geq \|a\|$. Inverting this, one sees that if (C.82) holds for some $c \geq \|a\|$, then
$\sigma(a) \subset \mathbb{R}^+$. Use this with $a \rightsquigarrow a + b$ and $c = \|a\| + \|b\|$, so $c \geq \|a + b\|$. Then


$\|c \cdot 1_A - (a + b)\| \leq \|(\|a\| \cdot 1_A - a)\| + \|(\|b\| \cdot 1_A - b)\| \leq c,$


where in the last step we used the previous paragraph for $a \in A^+$ and $b \in A^+$ separately.
As for $a$, this inequality implies $a + b \in A^+$. Finally, when $a \in A^+$ and
$a \in -A^+$ it must be that $\sigma(a) = \{0\}$, hence $a = 0$ by (B.257) and Definition A.1.

For example, when $a = a^*$ one checks the validity of the important inequalities
$-\|a\| \cdot 1_A \leq a \leq \|a\| \cdot 1_A,$ (C.83)
by taking the Gelfand transform of $C^*(a)$. This also yields the implication
$-b \leq a \leq b \;\Rightarrow\; \|a\| \leq \|b\|,$ (C.84)


because the antecedent and (C.83) with $a \rightsquigarrow b$ yield $-\|b\| \cdot 1_A \leq a \leq \|b\| \cdot 1_A$, so that
$\sigma(a) \subseteq [-\|b\|, \|b\|]$, hence $\|a\| \leq \|b\|$ by (B.257) and (B.254).
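
As a sanity check of the norm criterion (C.82) for positivity, consider the commutative C*-algebra of diagonal $2 \times 2$ complex matrices with the operator norm. This example and the elements $a$, $a'$ are chosen here purely for illustration.

```latex
% Illustration: diagonal 2x2 matrices, norm = largest absolute value of a diagonal entry.
\begin{align*}
a  &= \mathrm{diag}(1,3):  & \|a\|  &= 3, & \|3 \cdot 1 - a\|  &= \|\mathrm{diag}(2,0)\| = 2 \leq 3, & \sigma(a)  &= \{1,3\} \subset \mathbb{R}^+;\\
a' &= \mathrm{diag}(-1,3): & \|a'\| &= 3, & \|3 \cdot 1 - a'\| &= \|\mathrm{diag}(4,0)\| = 4 > 3,    & \sigma(a') &= \{-1,3\} \not\subset \mathbb{R}^+ .
\end{align*}
```

So (C.82) with $c = \|a\|$ holds for the positive element $a$ and fails for $a'$, in line with the argument in the proof of Proposition C.51.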

We now come to the central result in the theory of positivity in C*-algebras, 
which generalizes the cases A = B(H) and A = Co(X) discussed at the beginning. 


Theorem C.52. With $A^+ = \{a \in A \mid a \geq 0\}$ as in Definition C.49, one has
$A^+ = \{a^2 \mid a^* = a\}$ (C.85)
$\phantom{A^+} = \{a^*a \mid a \in A\}.$ (C.86)


Proof. If $\sigma(a) \subset \mathbb{R}^+$ and $a = a^*$, then $\sqrt{a} \in A$ is defined by Corollary C.26 for
$f = \sqrt{\cdot}$ and satisfies $(\sqrt{a})^2 = a$. Hence $A^+ \subseteq \{a^2 \mid a^* = a\}$. The opposite inclusion
follows from (C.53) and Corollary C.27. This proves (C.85).

Towards (C.86), the inclusion $A^+ \subseteq \{a^*a \mid a \in A\}$ is trivial from (C.85).


Lemma C.53. Each selfadjoint element $a$ has a unique decomposition


$a = a_+ - a_-,$ (C.87)


where $a_+, a_- \in A^+$ and $a_+ a_- = 0$. Moreover, $\|a_\pm\| \leq \|a\| = \max\{\|a_+\|, \|a_-\|\}$.


Proof. Apply Corollary C.26 with $f = \mathrm{id}_{\sigma(a)} = f_+ - f_-$, where $\mathrm{id}_{\sigma(a)}(t) = t$ and
$f_\pm(t) = \max\{\pm t, 0\}$. The norm property follows from (C.52). Uniqueness follows
from the corresponding property in $C(\sigma(a))$, where it is obvious.
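
For orientation, here is the decomposition (C.87) in the simplest commutative setting of diagonal $2 \times 2$ matrices. The element $a$ below is a hypothetical example added for illustration.

```latex
% Illustration: a_\pm = f_\pm(a) with f_\pm(t) = \max\{\pm t, 0\} applied entrywise.
\begin{align*}
a &= \mathrm{diag}(2,-3), & a_+ &= \mathrm{diag}(2,0), & a_- &= \mathrm{diag}(0,3),\\
a_+ - a_- &= a, & a_+ a_- &= 0, & \|a\| &= 3 = \max\{\|a_+\|, \|a_-\|\} = \max\{2, 3\} .
\end{align*}
```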


Apply the lemma to $a = b^*b$ (noting that $a$ is selfadjoint). Then


$(a_-)^3 = -a_-(a_+ - a_-)a_- = -a_- a\, a_- = -a_- b^*b\, a_- = -(ba_-)^* ba_-.$ (C.88)


Since $\sigma(a_-) \subset \mathbb{R}^+$ because $a_-$ is positive, we see from (C.53) with $f(t) = t^3$ that
$(a_-)^3 \geq 0$. Hence $-(ba_-)^* ba_- \geq 0$.


Lemma C.54. If $-c^*c \in A^+$ for some $c \in A$ then $c = 0$.

Proof. We can write $c = d + ie$, $d$ and $e$ selfadjoint, so that

$c^*c = 2d^2 + 2e^2 - cc^*.$ (C.89)

Now for any $a, b \in A$ one has


$\sigma(ab) \cup \{0\} = \sigma(ba) \cup \{0\}.$ (C.90)

This is because for $z \neq 0$, invertibility of $ab - z$ implies invertibility of $ba - z$; indeed,
$(ba - z)^{-1} = z^{-1}\left(b(ab - z)^{-1}a - 1_A\right).$ (C.91)


Applying (C.90) with $a \rightsquigarrow c$ and $b \rightsquigarrow c^*$, it follows that $\sigma(c^*c) \subset \mathbb{R}^-$ implies
$\sigma(cc^*) \subset \mathbb{R}^-$, hence $\sigma(-cc^*) \subset \mathbb{R}^+$. By (C.85) and Proposition C.51 (applied to
Definition C.50.2), eq. (C.89) then implies that $c^*c \geq 0$, i.e., $\sigma(c^*c) \subset \mathbb{R}^+$, so that
the assumption $-c^*c \in A^+$ now yields $\sigma(c^*c) = \{0\}$. Hence $c = 0$ by Proposition C.51
applied to Definition C.50.3.
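
The inverse formula (C.91) used in this proof can be checked by direct multiplication. The following short computation (added here as a verification sketch) writes $1$ for $1_A$ and assumes $z \neq 0$ with $ab - z$ invertible.

```latex
% Right-inverse check for (C.91); uses b(ab) = (ba)b and z central.
\begin{align*}
(ba - z)\, z^{-1}\bigl(b(ab - z)^{-1}a - 1\bigr)
  &= z^{-1}\bigl(b\,(ab)(ab - z)^{-1}a - z\,b(ab - z)^{-1}a - ba + z\bigr)\\
  &= z^{-1}\bigl(b\,(ab - z)(ab - z)^{-1}a - ba + z\bigr)\\
  &= z^{-1}(ba - ba + z) = 1 .
\end{align*}
```

The same computation with the two factors in the opposite order shows that the expression is also a left inverse of $ba - z$.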


By this lemma, the last claim preceding it implies $ba_- = 0$. As
$(a_-)^3 = -(ba_-)^* ba_- = 0,$ (C.92)


we see that $(a_-)^3 = 0$, and finally $a_- = 0$ by Corollary C.26 with $f(t) = t^{1/3}$. Hence
$b^*b = a_+ \in A^+$. Thus $\{a^*a \mid a \in A\} \subseteq A^+$, which ends the proof of Theorem C.52.


An important consequence of (C.86) is the fact that inequalities $a_1 \leq a_2$ for
selfadjoint $a_1, a_2$ are stable under conjugation by arbitrary elements $b \in A$, so that
$a_1 \leq a_2$ implies $b^*a_1b \leq b^*a_2b$. This is because $a_1 \leq a_2$ is the same as $a_2 - a_1 \geq 0$,
and hence by (C.86) there is an $a_3 \in A$ such that $a_2 - a_1 = a_3^*a_3$. But $(a_3b)^*a_3b \geq 0$,
i.e., $b^*a_1b \leq b^*a_2b$. For example, replace $a$ in (C.83) by $a^*a$, and use (C.2), yielding
$a^*a \leq \|a\|^2 1_A$. Applying the above principle gives the operator inequality


$b^*a^*ab \leq \|a\|^2 b^*b \quad (a, b \in A).$ (C.93)

We note that the definition of a state implies that if $a \leq b$, then $\omega(a) \leq \omega(b)$, so that
$\omega(b^*a^*ab) \leq \|a\|^2 \omega(b^*b),$ (C.94)


from (C.93). This is a key lemma for the GNS-construction (cf. Theorem C.88).
At last, we are also in a position to prove the fundamental Lemma C.4. 


Proof. If $\omega$ is positive and $a^* = a$, then (C.83) in the form $\|a\| \cdot 1_A - a \geq 0$ gives
$\omega(a) \leq \|a\|\,\omega(1_A)$, and hence $\omega(a) \in \mathbb{R}$. For general $a \in A$, eq. (C.8) then implies
$\omega(a^*) = \overline{\omega(a)}$ (which may alternatively be proved from Lemma C.53). This,
in turn, makes the form (C.24) hermitian. Cauchy-Schwarz then gives $|\omega(a)|^2 \leq
\omega(a^*a)\,\omega(1_A)$, as in (C.25). Furthermore, if $\|a\| \leq 1$ then also $\|a^*a\| \leq 1$ by
(C.2), so that (C.83) gives $\omega(a^*a) \leq \omega(1_A)$. Combining these inequalities yields
$|\omega(a)| \leq \omega(1_A)$, so $\omega$ is bounded with $\|\omega\| \leq \omega(1_A)$; taking $a = 1_A$ gives equality.

Conversely, assume that $\|\omega\| = \omega(1_A) = 1$. In proving that $\omega(a) \geq 0$ whenever
$a \geq 0$, we may also assume that $0 \leq a \leq 1_A$. Then (C.7) shows that $\alpha \equiv \omega(a) \in \mathbb{R}$.
Also, we have $\sigma(a) \subseteq [0,1]$ and hence $\sigma(1_A - a) \subseteq [0,1]$, which in turn implies
$0 \leq (1_A - a) \leq 1_A$, and hence $\|1_A - a\| \leq 1$, cf. (C.84). Then


$1 - \alpha \leq |1 - \alpha| = |\omega(1_A - a)| \leq \|\omega\|\,\|1_A - a\| \leq 1,$ (C.95)


whence $\alpha \geq 0$, and hence $\omega(a) \geq 0$.


C.8 Ideals in Banach algebras


This section returns to general Banach algebras. It has two aims: it completes the 
(first) proof of Theorem C.8, and it prepares for the theory of ideals in C*-algebras. 


Definition C.55. Let A be a Banach algebra. 


• A left ideal (right ideal) in $A$ is a closed linear subspace $J$ for which $a \in J$
implies $ba \in J$ ($ab \in J$) for all $b \in A$.

• An ideal in $A$ is both a left and a right ideal (i.e., a closed two-sided ideal).

• A maximal ideal is a proper ideal $J \subset A$ (i.e., $J \neq \{0\}$ and $J \neq A$) that is not
properly contained in any larger proper ideal.


Thus an ideal is closed by definition. However, it is useful to know that if we omit
the word ‘closed’ throughout Definition C.55, a maximal ideal $J \subset A$ (defined in the
purely algebraic sense) is automatically closed. Indeed, note that the closure $\overline{J}$ of $J$
cannot be $A$, since $J$ does not contain any invertible element of $A$ (otherwise it would
coincide with $A$), and the set of all invertible elements in $A$ is open (see the proof
of Theorem B.84). Since $J \subseteq \overline{J} \subset A$ and $J$ is maximal, $\overline{J} = J$.

Furthermore, one often uses the fact that an ideal $J$ that contains an invertible
element $a$ must coincide with $A$ (since $a^{-1}a = 1_A$ must then lie in $J$, whence $J = A$).

In the commutative case, left and right ideals are the same as ideals. For example,
if $A = C(X)$ for a compact space $X$, then each closed subspace $Y \subset X$ defines an ideal


$C(X;Y) = \{f \in C(X) \mid f(x) = 0 \ \forall x \in Y\}.$ (C.96)
Note that $C(X;Y)$ is indeed closed by definition of the sup-norm, and that
$C(X;Y) \cong C_0(X \setminus Y).$ (C.97)


Proposition C.83 in §C.11 shows that all ideals in $C(X)$ are of this form. It is not
necessary to assume that $Y$ is closed, but this assumption entails no loss of generality,
since $C(X;Y) = C(X;\overline{Y})$, where $\overline{Y}$ is the closure of $Y$. We will see that $C(X;Y)$
is maximal iff $Y$ is a point, and that all maximal ideals in $C(X)$ are of this form.
The next proposition is predicated on an elementary Banach space result: 


Lemma C.56. If $V$ is a Banach space and $W$ is a closed linear subspace of $V$, then
the vector space quotient $V/W$ is a Banach space in the “distance to $W$” norm


$\|\tau(v)\| = \inf_{w \in W} \|v - w\|,$ (C.98)


where $\tau : V \to V/W$ is the canonical projection. Also, $\|\tau(v)\| \leq \|v\|$ for any $v \in V$.

Proof. First, (C.98) is well defined, for if $\tau(v') = \tau(v)$, i.e., $v - v' = w' \in W$, then
$\|\tau(v')\| = \inf\{\|v' - w\|, w \in W\} = \inf\{\|v - w - w'\|, w \in W\}
 = \inf\{\|v - w\|, w \in W\} = \|\tau(v)\|.$

The axioms for a norm are easily verified, except positive definiteness: we have
$\|\tau(v)\| = 0$ iff $\inf\{\|v - w\|, w \in W\} = 0$; hence there must be a sequence $(w_n)$ in $W$
with $v - w_n \to 0$, or $w_n \to v$. Since $W$ is closed, $v \in W$, so that $\tau(v) = 0$. For the last
claim, eq. (C.98) yields $\|\tau(v)\| \leq \|v - w\|$ for all $w \in W$; take $w = 0$.

There seems to be no natural proof of the completeness of $V/W$, but here is a
trick: for any Cauchy sequence $(\tau(v_n))_n$ in $V/W$, find a subsequence $(\tau(v_{n_k}))_k$ with
$\|\tau(v_{n_{k+1}}) - \tau(v_{n_k})\| < 2^{-k}$ for all $k$. Using induction in $k$, one finds a sequence $(u_k)$ in
$V$ with $\tau(u_k) = \tau(v_{n_k})$ and $\|u_{k+1} - u_k\| < 2^{-k}$. Hence $u_k \to u$ (since $V$ is complete),
and hence $\tau(v_{n_k}) \to \tau(u)$ by continuity of $\tau$. Since a Cauchy sequence with a convergent
subsequence converges to the same limit, then also $\tau(v_n) \to \tau(u)$.
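
The induction step in this completeness argument can be spelled out as follows. This is a sketch added for the reader's convenience; the auxiliary elements $w_k \in W$ and the starting choice $u_1 = v_{n_1}$ are notation introduced here.

```latex
% Sketch of the induction: start with u_1 := v_{n_1}; w_k \in W exists by definition of the infimum in (C.98).
\begin{align*}
&\|\tau(v_{n_{k+1}} - u_k)\| = \|\tau(v_{n_{k+1}}) - \tau(v_{n_k})\| < 2^{-k}
   \;\Longrightarrow\; \exists\, w_k \in W:\ \|v_{n_{k+1}} - u_k - w_k\| < 2^{-k};\\
&u_{k+1} := v_{n_{k+1}} - w_k
   \;\Longrightarrow\; \tau(u_{k+1}) = \tau(v_{n_{k+1}}), \quad \|u_{k+1} - u_k\| < 2^{-k};\\
&\|u_m - u_k\| \leq \sum_{i=k}^{m-1} \|u_{i+1} - u_i\| < \sum_{i=k}^{\infty} 2^{-i} = 2^{1-k} \qquad (m > k).
\end{align*}
```

The last line shows that $(u_k)$ is a Cauchy sequence in $V$, which is what the completeness of $V$ is applied to.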


Proposition C.57. If $J$ is an ideal in a Banach algebra $A$, then the quotient $A/J$ is a
Banach algebra with multiplication


$\tau(a)\tau(b) = \tau(ab).$ (C.99)
If $A$ is unital and $J$ is proper, $A/J$ is unital, with unit $\tau(1_A)$ satisfying
$\|\tau(1_A)\| = 1.$ (C.100)


Proof. As far as the Banach algebra structure is concerned, first note that (C.99) is
well defined: when $j_1, j_2 \in J$ one has


$\tau(a + j_1)\tau(b + j_2) = \tau(ab + aj_2 + j_1b + j_1j_2) = \tau(ab) = \tau(a)\tau(b),$ (C.101)


since $aj_2 + j_1b + j_1j_2 \in J$ by definition of an ideal, and $\tau(j) = 0$ for all $j \in J$.
To prove (C.1), observe that, by definition of the infimum, for given $a \in A$, for
each $\varepsilon > 0$ there exists a $j \in J$ such that


$\|\tau(a)\| + \varepsilon \geq \|a + j\|.$ (C.102)


For if such a $j$ did not exist, then $\|\tau(a)\| < \|a + j\| - \varepsilon$ for all $j \in J$, violating
(C.98). On the other hand, for any $j \in J$, it is clear from (C.98) that


$\|\tau(a)\| = \|\tau(a + j)\| \leq \|a + j\|.$ (C.103)
For $a, b \in A$, choose $\varepsilon > 0$ and $j_1, j_2 \in J$ such that (C.102) holds for $a$, $b$, and estimate


$\|\tau(a)\tau(b)\| = \|\tau(a + j_1)\tau(b + j_2)\| = \|\tau((a + j_1)(b + j_2))\|
 \leq \|(a + j_1)(b + j_2)\| \leq \|a + j_1\|\,\|b + j_2\|
 \leq (\|\tau(a)\| + \varepsilon)(\|\tau(b)\| + \varepsilon).$ (C.104)


Letting $\varepsilon \to 0$ yields
$\|\tau(a)\tau(b)\| \leq \|\tau(a)\|\,\|\tau(b)\|.$ (C.105)


If $A$ has a unit, $\tau(1_A)$ is a unit in $A/J$, cf. (C.99). By (C.103) with $a = 1_A$ and
$j = 0$ one has $\|\tau(1_A)\| \leq \|1_A\| = 1$. On the other hand, from (C.105) and (C.99)
with $b = 1_A$ and $a \in A \setminus J$, one derives $\|\tau(1_A)\| \geq 1$. Hence (C.100) follows.

In a C*-algebra the last step is unnecessary, since a unit necessarily has norm one.


In the commutative case, a nice example (with $X$ and $Y$ compact, as above), is
$C(X)/C(X;Y) \cong C(Y),$ (C.106)


as two elements $f, g$ of $C(X)$ are identified in $C(X)/C(X;Y)$ when $f - g \in C(X;Y)$,
i.e., when they coincide on $Y$. If one looks at $C(X;Y)$ as the kernel of the restriction
map $r_Y : C(X) \to C(Y)$, then $\mathrm{ran}(r_Y) \cong C(X)/\ker(r_Y)$, which is just (C.106).

We now prove Proposition C.13, which we unfold as: 


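1. If $\omega \in \Sigma(A)$, then $J_\omega = \ker(\omega)$ is a maximal ideal in $A$;
2. $\omega_1 = \omega_2$ iff $J_{\omega_1} = J_{\omega_2}$;
3. Every maximal ideal $J$ is of the form $J = J_\omega$ for some $\omega \in \Sigma(A)$.


For the first claim, $J_\omega$ is an ideal since $\omega$ is multiplicative. To prove maximality,
suppose $J_\omega \subseteq I \subseteq A$ for some ideal $I$. Then $\omega(I)$ is an ideal in $\mathbb{C}$, so either $\omega(I) = \{0\}$
or $\omega(I) = \mathbb{C}$. In the former case, $I = J_\omega$ (since $I \subseteq \ker\omega = J_\omega$), in the latter, $I = A$
(because for any $a \in A$ there is $b \in I$ such that $\omega(a) = \omega(b)$, whence $a - b \in \ker\omega$
and hence $a - b \in I$, or $a \in b + I = I$). Thus $J_\omega$ is maximal.

For the second, if $\omega_1(a) = c$, then $\omega_1(a - c \cdot 1_A) = 0$ by (C.14), so if $\ker(\omega_1) =
\ker(\omega_2)$, then also $\omega_2(a - c \cdot 1_A) = 0$ and hence $\omega_2(a) = c = \omega_1(a)$.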
Finally, let $J$ be maximal. Since $J \neq A$, there is a nonzero $b \in A$, $b \notin J$. Form


$J_b = \{ba + j \mid a \in A, j \in J\}.$ (C.107)


Since $A$ is commutative, $J_b$ is an ideal. Taking $a = 0$ gives $J \subseteq J_b$. Taking $a = 1_A$
and $j = 0$ gives $b \in J_b$, so that $J_b \neq J$. Hence $J_b = A$, as $J$ is maximal. In particular,
$1_A \in J_b$, so that $1_A = ba + j$ for some $a \in A$, $j \in J$. Applying $\tau : A \to A/J$ gives


$\tau(1_A) = 1_{A/J} = \tau(ba) = \tau(b)\tau(a),$ (C.108)


because of (C.99) and $\tau(j) = 0$. Hence $\tau(a) = \tau(b)^{-1}$ in $A/J$. Since $b \notin J$ (i.e., $\tau(b) \neq 0$) was
arbitrary, this shows that every nonzero element of $A/J$ is invertible. At this point it
is therefore appropriate to invoke the Gelfand-Mazur Theorem:


Theorem C.58. If every nonzero element of a unital commutative Banach algebra
$B$ is invertible (i.e., if $B$ is simple), then $B \cong \mathbb{C}$ as Banach algebras.


Proof. Since $\sigma(b) \neq \emptyset$, for each $b \neq 0$ there is $\lambda \in \mathbb{C}$ for which $b - \lambda \cdot 1_B$ is not
invertible. Hence $b - \lambda \cdot 1_B = 0$ by assumption, and $b \mapsto \lambda$ is an isomorphism.


Hence there is an isomorphism $\psi : A/J \to \mathbb{C}$, from which we define $\omega : A \to \mathbb{C}$
by $\omega(a) = \psi(\tau(a))$. This map is clearly linear (since $\tau$ and $\psi$ are), and nonzero
(because $\omega(1_A) = 1$). Also, $\omega(a)\omega(b) = \omega(ab)$ by (C.99) and the fact that $\psi$ is a
homomorphism, so $\omega \in \Sigma(A)$. Finally, since $\ker(\tau) = J$ and $\psi$ is an isomorphism,
$J = \ker(\omega)$. This proves claim 3 above, and therefore Proposition C.13 also follows.

Definition C.55 verbatim applies to C*-algebras. One would expect that an ideal in
a C*-algebra is required to be selfadjoint by definition, but this is unnecessary:


Proposition C.59. Let $J$ be an ideal in a C*-algebra $A$. If $a \in J$ then $a^* \in J$; in other
words, every ideal in a C*-algebra is automatically selfadjoint.


The proof (which generalizes a similar argument for compact operators, given at the 
end of §B.18) relies on the theory of approximate units (see §C.5). 


Proof. Let $J \subset A$ be the given ideal, and put $J^* = \{a^* \mid a \in J\}$. Note that $j \in J$ implies
$jj^* \in J \cap J^*$: it lies in $J$ because $J$ is an ideal, hence a right-ideal, and it lies in $J^*$
because $J^*$ is an ideal, hence a left-ideal. Since $J$ is an ideal, $J \cap J^*$ is a C*-subalgebra
of $A$. Hence by C.36 it has an approximate unit $\{1_\lambda\}$. Take $j \in J$. Using (C.2),


$\|j^* - j^*1_\lambda\|^2 = \|(j - 1_\lambda j)(j^* - j^*1_\lambda)\|
 = \|(jj^* - jj^*1_\lambda) - 1_\lambda(jj^* - jj^*1_\lambda)\|
 \leq \|jj^* - jj^*1_\lambda\| + \|1_\lambda\|\,\|jj^* - jj^*1_\lambda\|,$ (C.109)


since $1_\lambda^* = 1_\lambda$. As we have seen, $jj^* \in J \cap J^*$, so that, also using (C.69), both terms
vanish for $\lambda \to \infty$. Hence $\lim_{\lambda \to \infty} \|j^* - j^*1_\lambda\| = 0$. But $1_\lambda$ lies in $J \cap J^*$, so certainly
$1_\lambda \in J$, and since $J$ is an ideal it must be that $j^*1_\lambda \in J$ for all $\lambda$. Hence $j^*$ is a
norm-limit of elements in $J$; since $J$ is closed, it follows that $j^* \in J$.


We now turn to a C*-algebraic analogue of Proposition C.57, which is of sufficient
importance to promote it to the status of a theorem:


Theorem C.60. Let $J$ be an ideal in a C*-algebra $A$. Then $A/J$ is a C*-algebra with
respect to the norm (C.98), the multiplication (C.99), and the involution


$\tau(a)^* = \tau(a^*).$ (C.110)


The proof of this theorem uses approximate units, too. In view of Proposition C.57,
all we need to prove to establish Theorem C.60 is the property (C.2). This uses:


Lemma C.61. Let $\{1_\lambda\}$ be an approximate unit for $J$, and let $a \in A$. Then


$\|\tau(a)\| = \lim_{\lambda \to \infty} \|a - a1_\lambda\|.$ (C.111)


Proof. It is obvious from (C.98) that
$\|a - a1_\lambda\| \geq \|\tau(a)\|.$ (C.112)
For the opposite inequality, add a unit $1_A$ to $A$ if necessary, pick any $j \in J$, and write
$\|a - a1_\lambda\| = \|(a + j)(1_A - 1_\lambda) + j(1_\lambda - 1_A)\| \leq \|a + j\|\,\|1_A - 1_\lambda\| + \|j1_\lambda - j\|.$ (C.113)


Note that
$\|1_A - 1_\lambda\| \leq 1,$ (C.114)
by Definition C.35 and the proof of Proposition C.51. The second term on the right-hand
side goes to zero for $\lambda \to \infty$, since $j \in J$. Hence


$\lim_{\lambda \to \infty} \|a - a1_\lambda\| \leq \|a + j\|.$ (C.115)


For each $\varepsilon > 0$ we can choose $j \in J$ so that (C.102) holds. For this specific $j$, we
combine (C.112), (C.115), and (C.102) to find


$\lim_{\lambda \to \infty} \|a - a1_\lambda\| - \varepsilon \leq \|\tau(a)\| \leq \|a - a1_\lambda\|.$ (C.116)


Letting $\varepsilon \to 0$ proves (C.111).


We now prove (C.2) in $A/J$. Successively using (C.111), (C.2) in $A$, (C.114),
(C.111), (C.99), and (C.110), we find


$\|\tau(a)\|^2 = \lim_{\lambda \to \infty} \|a - a1_\lambda\|^2 = \lim_{\lambda \to \infty} \|(a - a1_\lambda)^*(a - a1_\lambda)\|
 = \lim_{\lambda \to \infty} \|(1_A - 1_\lambda)a^*a(1_A - 1_\lambda)\| \leq \lim_{\lambda \to \infty} \|1_A - 1_\lambda\|\,\|a^*a(1_A - 1_\lambda)\|
 \leq \lim_{\lambda \to \infty} \|a^*a(1_A - 1_\lambda)\| = \|\tau(a^*a)\| = \|\tau(a^*)\tau(a)\|
 = \|\tau(a)^*\tau(a)\|.$ (C.117)


As in the proof of Proposition C.30, this implies (C.2), and hence Theorem C.60.

We now state and prove the key result about morphisms.


Theorem C.62. Let $\alpha : A \to B$ be a nonzero homomorphism between C*-algebras.


1. The homomorphism $\alpha$ is continuous, with norm $\|\alpha\| = 1$.

2. Its kernel $\ker(\alpha)$ is an ideal in $A$.

3. If $\alpha$ is injective, then it is isometric.

4. An isomorphism of C*-algebras is automatically isometric.

5. The range $\alpha(A)$ is a C*-subalgebra of $B$; in particular, $\alpha(A)$ is closed in $B$.


Proof. If necessary, we first reduce the proof of the first claim to the case where
$A$ and $B$ have units and $\alpha$ is unital: we do so by replacing $A$ and $B$ by $\dot{A}$ and $\dot{B}$,
respectively (even if $A$ and/or $B$ was already unital in the first place, but $\alpha$ was not),
and replacing $\alpha$ by the homomorphism $\dot{\alpha} : \dot{A} \to \dot{B}$ defined in (C.66). If we do so,
it follows from Lemma C.34 that in the worst case the spectrum of $a$ or $\alpha(a)$ is
modified by adding 0, which does not change the spectral radius. Therefore, the
move from $\alpha$ to $\dot{\alpha}$ makes no difference to the argument to follow, so we assume
that $1_A \in A$ and $1_B \in B$, and $\alpha(1_A) = 1_B$. If $z \in \rho(a)$, so that $(a - z)^{-1}$ exists in
$A$, then $\alpha(a - z)$ is certainly invertible in $B$, for (C.4) implies that $(\alpha(a - z))^{-1} =
\alpha((a - z)^{-1})$. Hence $\rho(a) \subseteq \rho(\alpha(a))$, so that


$\sigma(\alpha(a)) \subseteq \sigma(a).$ (C.118)


Replacing $a$ by $a^*a$ this gives $r(\alpha(a^*a)) \leq r(a^*a)$, and since $\alpha(a^*a) = \alpha(a)^*\alpha(a)$,
eq. (C.55) yields $\|\alpha(a)\| \leq \|a\|$, and hence $\|\alpha\| \leq 1$. This proves continuity of $\alpha$.

Recalling that ideals in C*-algebras have to be closed by definition, this also
implies the second claim of the theorem (whose algebraic content is trivial).

We now prove the third claim of the theorem (which trivially implies the fourth).
Assume there is $b \in A$ for which $\|\alpha(b)\| \neq \|b\|$, so that $\sigma(a) \neq \sigma(\alpha(a))$ for $a =
b^*b$ by (C.55). Then (C.118) implies the strict inclusion $\sigma(\alpha(a)) \subset \sigma(a)$ (as a closed
subset). By Urysohn’s lemma, there is a nonzero function $f \in C(\sigma(a))$ that vanishes
on $\sigma(\alpha(a))$, so that $f(\alpha(a)) = 0$ by Corollary C.26. By Lemma C.63 below, this
implies $\alpha(f(a)) = 0$. If $\alpha$ is injective, this contradicts the property $f(a) \neq 0$, which
follows from $f \neq 0$ and (C.52). Thus $\alpha$ must be isometric.

Combining the second claim with Theorem C.60, we see that $A/\ker(\alpha)$ is a C*-algebra.
By the theory of vector spaces, we have a vector space isomorphism


$\psi : A/\ker(\alpha) \to \alpha(A),$ (C.119)


so that
$\psi \circ \tau = \alpha.$ (C.120)


Since $\alpha$ and $\tau$ are homomorphisms between C*-algebras, so is $\psi$. Since $\psi$ is injective,
it is isometric, as we have just shown. Hence $\psi$ has closed range
in $B$. But $\psi(A/\ker(\alpha)) = \alpha(A)$, so that $\alpha$ has closed range in $B$. Since $\alpha$ is a morphism,
its image is a *-algebra in $B$, which by the preceding sentence is closed in
the norm of $B$. Hence $\alpha(A)$, inheriting all operations in $B$, is a C*-algebra.

Finally, we prove that for the projection $\tau : A \to A/J$ in the case at hand we have


$\|\tau\| = 1.$ (C.121)


If $A$ has a unit, this follows from Lemma C.56 with (C.100). If not, the argument
is similar, using an approximate identity $(1_\lambda)$ for $A$: from (C.105) we obtain
$\lim_\lambda \|\tau(1_\lambda)\| = 1$, which with (C.69) gives $\sup_\lambda \|\tau(1_\lambda)\| = 1$. Since $\|\tau\| \leq 1$ from
Lemma C.56, this yields (C.121).

Because $\psi$ is an isometry, it then follows from (C.120) that $\|\alpha\| = 1$.


Here we used a nice property of the continuous functional calculus (Theorem C.25): 


Lemma C.63. If $\alpha : A \to B$ is a morphism, and $a = a^*$, then
$f(\alpha(a)) = \alpha(f(a)) \quad (f \in C(\sigma(a))).$ (C.122)


Here $f(a)$ and $f(\alpha(a))$ are defined through Theorem C.25, cf. (C.118).


Proof. The property is true for polynomials by (C.4), since for those functions, $f(a)$
and $f(\alpha(a))$ have their naive meaning. The general claim follows by continuity.


Corollary C.64. Every ideal in a C*-algebra is the kernel of some homomorphism. 


Proof. This follows from Proposition C.59, since $J$ is the kernel of $\tau : A \to A/J$,
where $A/J$ is a C*-algebra and $\tau$ is a morphism by (C.99) and (C.110).


C.10 Hilbert C*-modules and multiplier algebras


In §C.5 we explained the minimal way of adding a unit to a C*-algebra that did not
have one to begin with (although the procedure even works if it does). There is also
a maximal way, which embeds a non-unital C*-algebra in its multiplier algebra. In
our view, this maximal extension is actually more elegant and useful than the minimal
one, although the commutative case might give the opposite impression: here
(as we have seen), the minimal extension corresponds to the simple one-point
compactification of the Gelfand spectrum, whereas the maximal one extends the latter to
its awesome Čech-Stone compactification. In topology one may doubt if the latter is
indeed the neater choice, but for many noncommutative C*-algebras the multiplier
algebra comes naturally. For example, the C*-algebra $B_0(H)$ of compact operators
on a Hilbert space $H$ is thereby turned into the C*-algebra $B(H)$ of bounded ones.
There are various ways of defining multiplier algebras. Although not strictly necessary,
we offer the powerful entrance provided by Hilbert C*-modules, which are
simultaneous generalizations of C*-algebras, Hilbert spaces, and vector bundles.


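Definition C.65. A pre-Hilbert C*-module over a C*-algebra $A$ consists of:


• A right $A$-module $E$, i.e., a complex linear space equipped with a bilinear map
$E \times A \to E$, written $(\psi, a) \mapsto \psi a$ (where $\psi \in E$ and $a \in A$) such that


$(\psi b)a = \psi(ba).$ (C.123)


• A map $\langle\, ,\, \rangle_A : E \times E \to A$, linear in the second entry (the axioms below implying
antilinearity in the first entry) that for all $\psi, \varphi \in E$ and $a \in A$, satisfies


$\langle \psi, \varphi \rangle_A^* = \langle \varphi, \psi \rangle_A;$ (C.124)
$\langle \psi, \varphi a \rangle_A = \langle \psi, \varphi \rangle_A\, a;$ (C.125)
$\langle \psi, \psi \rangle_A \geq 0;$ (C.126)
$\langle \psi, \psi \rangle_A = 0 \;\Leftrightarrow\; \psi = 0.$ (C.127)


It is useful to note that (C.124) and (C.125) imply that


$\langle \psi a, \varphi \rangle_A = a^* \langle \psi, \varphi \rangle_A.$ (C.128)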
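
For completeness, the short derivation behind (C.128) may be written out as follows (a verification sketch added here, using only (C.124), (C.125), and the fact that the involution in $A$ reverses products):

```latex
% Each step is annotated with the axiom used.
\begin{align*}
\langle \psi a, \varphi \rangle_A
  &= \langle \varphi, \psi a \rangle_A^*                      && \text{by (C.124)}\\
  &= \bigl( \langle \varphi, \psi \rangle_A\, a \bigr)^*      && \text{by (C.125)}\\
  &= a^* \langle \varphi, \psi \rangle_A^*                    && \text{since } (bc)^* = c^* b^* \text{ in } A\\
  &= a^* \langle \psi, \varphi \rangle_A                      && \text{by (C.124)}.
\end{align*}
```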


Lemma C.66. In a pre-Hilbert C*-module $E$ over a C*-algebra $A$ one has:


$\langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A \leq \|\varphi\|^2 \langle \psi, \psi \rangle_A;$ (C.129)
$\|\langle \psi, \varphi \rangle_A\| \leq \|\psi\|\,\|\varphi\|;$ (C.130)
$\|\psi a\| \leq \|\psi\|\,\|a\|,$ (C.131)


in which the following expression (which duly defines a norm on $E$) occurs:


$\|\psi\| = \|\langle \psi, \psi \rangle_A\|^{1/2}.$ (C.132)

Proof. To prove (C.129), we assume $\varphi \neq 0$ (otherwise, the claim clearly holds), so
that also $\|\varphi\| > 0$ by (C.127) and (C.132). Replacing $\varphi$ by $\varphi/\|\varphi\|$ if necessary (i.e.,
if $\|\varphi\| \neq 1$), it is then enough to show that whenever $\|\varphi\| = 1$, we have


$\langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A \leq \langle \psi, \psi \rangle_A.$ (C.133)


To this effect, we substitute $\varphi \langle \varphi, \psi \rangle_A - \psi$ for $\psi$ in (C.126) and use (C.128), (C.124),
and (C.125), and (C.93), the latter in the form $b^*cb \leq \|c\|\, b^*b$ for any $b$ and $c \geq 0$ in $A$.
This gives (C.129). Eqs. (C.2), (C.124), and (C.129) then imply (C.130). Eq. (C.131)
follows from (C.128), (C.93), (C.84), and (C.2).

Finally, (C.132) defines a norm: scaling is clear, positive definiteness follows
from (C.127), and the triangle inequality is easily derived from (C.130).
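
The substitution step leading to (C.133) can be made explicit as follows. This is a sketch of the computation indicated in the proof; the abbreviations $b$, $c$ and $\chi$ are introduced here for readability and are not the author's notation.

```latex
% Abbreviations (ours): b = <phi,psi>_A, c = <phi,phi>_A with ||c|| = ||phi||^2 = 1, chi = phi b - psi.
\begin{align*}
0 \leq \langle \chi, \chi \rangle_A
  &= b^* \langle \varphi, \varphi \rangle_A\, b - b^* \langle \varphi, \psi \rangle_A
     - \langle \psi, \varphi \rangle_A\, b + \langle \psi, \psi \rangle_A
     && \text{by (C.126), (C.128), (C.125)}\\
  &= b^* c\, b - 2\, b^* b + \langle \psi, \psi \rangle_A
     && \text{since } \langle \psi, \varphi \rangle_A = b^* \text{ by (C.124)}\\
  &\leq b^* b - 2\, b^* b + \langle \psi, \psi \rangle_A
   = \langle \psi, \psi \rangle_A - \langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A
     && \text{by } b^* c\, b \leq \|c\|\, b^* b .
\end{align*}
```

Rearranging the resulting inequality gives (C.133).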


Corollary C.67. The inner product on a pre-Hilbert C*-module is nondegenerate,
in that $\psi = 0$ iff $\langle\psi,\varphi\rangle_A = 0$ for all $\varphi \in E$.
Proof. It follows from (C.129) that for any $\psi \in E$, we have


$$\|\psi\| = \sup\{\|\langle\psi,\varphi\rangle_A\|,\ \varphi \in E,\ \|\varphi\| = 1\}. \tag{C.134}$$


We now come to the main definition.

Definition C.68. A Hilbert C*-module over A is a pre-Hilbert C*-module over A
that is complete in the norm (C.132). We also say that E is a Hilbert A-module.

The three most straightforward examples of this concept, written “$E \rightleftharpoons A$”, are:


• C*-algebras themselves: $E = A$ with action $(a,b) \mapsto ab$ and inner product
$$\langle a,b\rangle_A = a^*b. \tag{C.135}$$
By (C.2), the norm in E defined by (C.132) coincides with the original norm.
• Hilbert spaces: $E = H$ and $A = \mathbb{C}$, acting on H by the given scalar multiplication.
• Hermitian vector bundles $\mathcal{E}$ over locally compact Hausdorff spaces X: here $E = C_0(X,\mathcal{E})$
consists of the continuous cross-sections $\psi$ of $\mathcal{E}$ vanishing at infinity,
$A = C_0(X)$ has its natural action on E given by $(\psi a)(x) = a(x)\psi(x)$, and the $C_0(X)$-valued
inner product is given by the hermitian structure $\langle\cdot,\cdot\rangle_x$ on each fiber,


$$\langle\psi,\varphi\rangle_{C_0(X)}(x) = \langle\psi(x),\varphi(x)\rangle_x. \tag{C.136}$$


This implies a norm $\|\psi\| = \sup\{\|\psi(x)\|_x,\ x \in X\}$, where $\|v\|_x = \langle v,v\rangle_x^{1/2}$.
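The first example can be checked numerically in a small case. The following sketch takes $A = E = M_2(\mathbb{C})$ (an assumption made purely for illustration) and tests (C.124), (C.125), (C.128), (C.130), (C.131) and the coincidence of the two norms on random matrices. All function names are hypothetical helpers, and the tolerances are ad hoc.

```python
import math
import random

# 2x2 complex matrices as nested tuples; A = M_2(C) acts on E = A from the right.
def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def adj(a):
    # Conjugate transpose a*
    return tuple(tuple(a[j][i].conjugate() for j in range(2)) for i in range(2))

def inner(a, b):
    # A-valued inner product <a, b>_A = a* b, cf. (C.135)
    return mul(adj(a), b)

def op_norm(a):
    # Operator norm: square root of the largest eigenvalue of a* a,
    # computed in closed form for a positive 2x2 matrix.
    p = mul(adj(a), a)
    tr = (p[0][0] + p[1][1]).real
    det = (p[0][0] * p[1][1] - p[0][1] * p[1][0]).real
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt(max((tr + math.sqrt(disc)) / 2.0, 0.0))

def module_norm(psi):
    # ||psi|| = || <psi, psi>_A ||^(1/2), cf. (C.132)
    return math.sqrt(op_norm(inner(psi, psi)))

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def rand_matrix(rng):
    return tuple(tuple(complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2))
                 for _ in range(2))

def check_axioms(trials=100, seed=0, tol=1e-7):
    rng = random.Random(seed)
    for _ in range(trials):
        psi, phi, a = rand_matrix(rng), rand_matrix(rng), rand_matrix(rng)
        ok = (
            close(adj(inner(psi, phi)), inner(phi, psi))                       # (C.124)
            and close(inner(psi, mul(phi, a)), mul(inner(psi, phi), a))        # (C.125)
            and close(inner(mul(psi, a), phi), mul(adj(a), inner(psi, phi)))   # (C.128)
            and op_norm(inner(psi, phi)) <= module_norm(psi) * module_norm(phi) + tol  # (C.130)
            and module_norm(mul(psi, a)) <= module_norm(psi) * op_norm(a) + tol        # (C.131)
            and abs(module_norm(psi) - op_norm(psi)) < tol   # module norm = C*-norm
        )
        if not ok:
            return False
    return True
```

Calling `check_axioms()` runs the checks on 100 random triples; a `False` result would point to a violated identity (or to a tolerance chosen too tightly).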


A Hilbert C*-module $E \rightleftharpoons A$ defines a C*-algebra C*(E,A) that consists of all
maps $a : E \to E$ for which there exists a map $a^* : E \to E$ such that for all $\psi,\varphi \in E$,


$$\langle\psi,a\varphi\rangle_A = \langle a^*\psi,\varphi\rangle_A. \tag{C.137}$$


Such maps are called adjointable. For example, if E = A, as in the first example
above, then any element $a \in A$ defines an adjointable map simply by left multiplication
(i.e., $a(b) = ab$). If A has a unit, then every adjointable map on A arises in this way,
whereas in the nonunital case there are (many) more adjointable maps on $A \rightleftharpoons A$.
We now show that adjointable maps on a Hilbert C*-module form a C*-algebra.


Theorem C.69. 1. An adjointable map on a Hilbert A-module is automatically $\mathbb{C}$-linear,
A-linear (that is, $(a\psi)b = a(\psi b)$ for all $\psi \in E$ and $b \in A$), and bounded.

2. The adjoint of an adjointable map is unique, and the map $a \mapsto a^*$ defines an
involution on the space C*(E,A) of all adjointable maps on E.

3. Equipped with this involution, and with the usual operator norm on the Banach
space E, the space C*(E,A) is a C*-algebra.

4. For each $a \in C^*(E,A)$ and $\psi \in E$, the usual bound $\|a\psi\| \leq \|a\|\,\|\psi\|$ sharpens to


$$\langle a\psi,a\psi\rangle_A \leq \|a\|^2\langle\psi,\psi\rangle_A. \tag{C.138}$$


Proof. The property of $\mathbb{C}$-linearity is obvious, whereas A-linearity follows from
(C.128): this gives $\langle a(\psi b),\varphi\rangle_A = \langle a(\psi)b,\varphi\rangle_A$, upon which Corollary C.67 yields
the claim. A similar argument shows that $a^* \in C^*(E,A)$ when $a \in C^*(E,A)$.

To prove boundedness, fix $\psi \in E$ and $a \in C^*(E,A)$, and define $T_\psi : E \to A$ by
$T_\psi\varphi = \langle a^*a\psi,\varphi\rangle_A$. It is clear from (C.130) that $\|T_\psi\| \leq \|a^*a\psi\|$, so that $T_\psi$ is
bounded. On the other hand, since a is adjointable, one has $T_\psi\varphi = \langle\psi,a^*a\varphi\rangle_A$, so
that, using (C.130) once again, one has $\|T_\psi\varphi\| \leq \|a^*a\varphi\|\,\|\psi\|$. Since E is complete
we may apply the Banach–Steinhaus Theorem B.78, which gives


$$\sup\{\|T_\psi\|,\ \psi \in E,\ \|\psi\| = 1\} < \infty. \tag{C.139}$$


It then follows from (C.132) that $\|a\| < \infty$. Uniqueness and involutivity of the adjoint
are proved as for Hilbert spaces; the former follows from (C.127), the latter
in addition requires (C.124). The space C*(E,A) is norm-closed, since one easily
verifies from (C.137) and (C.132) that if $a_n \to a$, then $a_n^*$ converges to $a^*$. As a
norm-closed space of linear maps on a Banach space, C*(E,A) is a Banach algebra,
so that it satisfies (C.1). To check (C.2), one infers from (C.132) and the definition
(C.137) of the adjoint that $\|a\|^2 \leq \|a^*a\|$; using (C.1) and the argument leading to
(A.22), one first obtains $\|a^*\| = \|a\|$, and subsequently $\|a^*a\| = \|a\|^2$.

Finally, it follows from (C.126), (C.86), and (C.137) that for fixed $\psi \in E$, the
map $a \mapsto \langle\psi,a\psi\rangle_A$ from C*(E,A) to A is positive. Replacing a by $a^*a$ in (C.83) and
using (C.2) and (C.137) then leads to (C.138).


In our first example the C*-algebra C*(A,A) is usually called the multiplier alge- 
bra, denoted by M(A). If A has a unit, then M(A) =A, but in general M(A) is much 
larger than A, and obviously it always has a unit (given by the unit operator on A). 


Proposition C.70. For any commutative C*-algebra A we have an isomorphism


$$M(A) \to C_b(\Sigma(A)); \tag{C.140}$$
$$a \mapsto \hat{a}, \tag{C.141}$$


where, in terms of the Gelfand isomorphism $A \cong C_0(\Sigma(A))$, $f \mapsto \hat{f}$, we have


$$\widehat{a(f)} = \hat{a}\hat{f}. \tag{C.142}$$

In particular, for any locally compact space X we have an isomorphism

$$M(C_0(X)) \cong C_b(X), \tag{C.143}$$
where $a \in C_b(X)$ simply acts on $f \in C_0(X)$ by $a(f) = af$.
Proof. If A is commutative, then by Theorem C.69.1, any $a \in M(A)$ satisfies


$$a(fg) = a(f)g = f a(g), \quad f,g \in A. \tag{C.144}$$


For any $f,g \in A$ and $\omega \in \Sigma(A)$ such that $\omega(f) \neq 0$ and $\omega(g) \neq 0$, the second equality
in (C.144) gives $\omega(a(f))/\omega(f) = \omega(a(g))/\omega(g)$. Since $\omega \neq 0$, there is at least one
$f \in A$ for which $\omega(f) \neq 0$, so that the function $\hat{a} : \Sigma(A) \to \mathbb{C}$ given by


$$\hat{a}(\omega) = \frac{\omega(a(f))}{\omega(f)} = \frac{\widehat{a(f)}(\omega)}{\hat{f}(\omega)} \tag{C.145}$$


is well defined. Thus (C.142) holds by construction. Since $a(f) \in A$, continuity of
the Gelfand transform makes $\hat{a}$ continuous. Next, we estimate


$$|\hat{a}(\omega)\hat{f}(\omega)| = |\widehat{a(f)}(\omega)| \leq \|\widehat{a(f)}\|_\infty = \|a(f)\| \leq \|a\|\,\|f\|, \tag{C.146}$$
where we used (C.145) and isometry of the Gelfand transform, cf. (C.18). Hence
$$|\hat{a}(\omega)| = \frac{|\hat{a}(\omega)\hat{f}(\omega)|}{|\hat{f}(\omega)|} \leq \frac{\|a\|}{|\hat{f}(\omega)|} \tag{C.147}$$


for any $f \in A$ and $\omega \in \Sigma(A)$ for which $\omega(f) \neq 0$ and $\|f\| = 1$. For those, we have


$$\inf\{|\hat{f}(\omega)|^{-1} \mid \omega \in \Sigma(A), \omega(f) \neq 0, \|f\| = 1\} = \left(\sup\{|\hat{f}(\omega)| \mid \omega \in \Sigma(A), \omega(f) \neq 0, \|f\| = 1\}\right)^{-1} = \|\hat{f}\|_\infty^{-1} = 1, \tag{C.148}$$


again using $\|f\| = \|\hat{f}\|_\infty$. Together with (C.147), this gives $|\hat{a}(\omega)| \leq \|a\|$, and hence


$$\|\hat{a}\|_\infty \leq \|a\|. \tag{C.149}$$


In particular, $\hat{a}$ is bounded, so that the map (C.140) - (C.141) is well defined. This
map has an inverse, as clearly any function $\hat{a} \in C_b(\Sigma(A))$ defines an element of
$M(C_0(\Sigma(A)))$ by multiplication, and hence defines an element $a \in M(A)$ by the
inverse Gelfand transform, cf. (C.142).


Since an isomorphism of C*-algebras is isometric, we have $\|\hat{a}\|_\infty = \|a\|$. This may
also be proved directly from (C.149) and the converse inequality


$$\|a\| = \sup\{\|a(f)\| \mid f \in A, \|f\| = 1\} = \sup\{\|\widehat{a(f)}\|_\infty \mid f \in A, \|\hat{f}\|_\infty = 1\} = \sup\{\|\hat{a}\hat{f}\|_\infty \mid f \in A, \|\hat{f}\|_\infty = 1\} \leq \|\hat{a}\|_\infty. \tag{C.150}$$
Most of this argument also works for the pre-Hilbert $C_0(X)$-module $E = C_c(X)$
(whose completion is $C_0(X)$, of course), except for the inequality (C.149), which
relies on boundedness of a (cf. Theorem C.69). This is lost if E fails to be complete,
and we now merely obtain an isomorphism of algebras with involution:


$$M(C_c(X)) \cong C(X). \tag{C.151}$$


For a slightly different take on this, for a general C*-algebra A we define an unbounded
multiplier on A (seen as a Hilbert A-module) as a closed $\mathbb{C}$-linear and
A-linear map $m : D(m) \to A$, where D(m) is a dense right-ideal in A (in the algebraic
sense, i.e., by exception we do not require an ideal to be closed). In general, the set
UM(A) of all unbounded multipliers on A has little algebraic structure (like the set
of all closed operators on a Hilbert space), but in the commutative case we have


$$UM(C_0(X)) \cong C(X), \tag{C.152}$$


under the same identification as in (C.143). This means that any unbounded multiplier
on $C_0(X)$ takes the form $g \mapsto fg$ for some $f \in C(X)$, with domain


$$D(f) = \{g \in C_0(X) \mid fg \in C_0(X)\}. \tag{C.153}$$


The argument is the same as in the proof of Proposition C.70 (except for boundedness),
adding the fact that $C_c(X)$ is a core for each f, in that its closure (defined as
usual by the set of all $g \in C_0(X)$ for which there is a sequence $(g_n)$ in $C_c(X)$ such
that $g_n \to g$ and $fg_n$ is Cauchy) is given by D(f); then $fg_n \to fg$ (in the sup-norm).
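As a small worked instance of (C.153), consider the hypothetical choice $X = \mathbb{R}$ and $f(x) = x$, which is continuous but unbounded. The two test functions below are chosen only for illustration.

```latex
% Domain of the unbounded multiplier g \mapsto fg on C_0(\mathbb{R}), with f(x) = x:
\[
  D(f) = \{\, g \in C_0(\mathbb{R}) \mid x \mapsto x\,g(x) \text{ lies in } C_0(\mathbb{R}) \,\}.
\]
% A function in the domain:
\[
  g_1(x) = \frac{1}{1+x^2}, \qquad x\,g_1(x) = \frac{x}{1+x^2} \to 0 \quad (|x| \to \infty),
  \qquad\text{so } g_1 \in D(f).
\]
% A function in C_0(\mathbb{R}) that is not in the domain:
\[
  g_2(x) = \frac{1}{\sqrt{1+x^2}}, \qquad x\,g_2(x) \to \pm 1 \quad (x \to \pm\infty),
  \qquad\text{so } g_2 \notin D(f).
\]
```

Thus D(f) is a proper ideal of $C_0(\mathbb{R})$ in this case; it is dense because it contains $C_c(\mathbb{R})$.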
Let us return to the bounded case, concentrating on the multiplier algebra 


M(A) =C*(A,A). (C.154) 


Proposition C.71. There is an inclusion $A \hookrightarrow M(A)$, where A (seen as a subspace
of B(A)) acts on A (seen as a Hilbert A-module) by left multiplication. Moreover, A
is an essential ideal in M(A), in having nonzero intersection with any other nonzero ideal.


Proof. We first note that each map $L_a : b \mapsto ab$ ($a,b \in A$) is adjointable, because
$$\langle c,L_a(b)\rangle_A = \langle c,ab\rangle_A = c^*ab = (a^*c)^*b = \langle a^*c,b\rangle_A = \langle L_{a^*}(c),b\rangle_A,$$


so that the adjoint of $L_a$ is $L_{a^*}$. Furthermore, $L_a = 0$ iff $a = 0$, as can be seen by
taking an approximate unit in A, or from Lemma C.47. Hence $A \subseteq M(A)$, which is a
proper inclusion iff A has no unit (since M(A) always has one, i.e. the unit of B(A)).

Now let $m \in M(A)$ and $a \in A$. Then $(m \circ a)(b) = m(ab) = m(a)b$, since $m \in C^*(A,A)$
is A-linear. Hence $ma \equiv m \circ a \in A$, since $m(a) \in A$. Since $am = (m^*a^*)^*$,
this argument shows that also $am \in A$, making A an ideal in M(A).

To see that this ideal is essential, we note (as a little exercise) that an ideal $J \subset B$
in a C*-algebra B is essential iff $bJ = 0$ (i.e., $bj = 0$ for each $j \in J$ and some $b \in B$)
implies $b = 0$. Again by Lemma C.47, if $b(ja) = 0$ for each $j \in A$, $a \in A$, and some
$b \in M(A)$, then $b(c) = 0$ for each $c \in A$, and hence $b = 0$.
In general, one may compute M(A) as follows. If A and B are C*-algebras and
E is a Hilbert A-module, we say that a homomorphism $\alpha : B \to C^*(E,A)$ is nondegenerate
if $\overline{\alpha(B)E} = E$, that is, if the closed linear span of all vectors of the type
$\alpha(b)\psi$, where $b \in B$ and $\psi \in E$, equals E. It can be shown (from the Cohen–Hewitt
factorization theorem) that in this case one needs neither the linear span nor the
closure to recover E, in that each element of E literally factorizes:


$$E = \{\alpha(b)\psi \mid b \in B, \psi \in E\}. \tag{C.155}$$
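Theorem C.72. Suppose A and B are C*-algebras, E is a Hilbert A-module, and
$$\alpha : B \to C^*(E,A)$$


is a nondegenerate homomorphism. If B is an ideal in a C*-algebra C, then $\alpha$ has a
unique extension to C (which is injective if B is essential in C and $\alpha$ is injective).


Proof. The idea is easy: write $\varphi \in E$ as $\varphi = \alpha(b)\psi$ for some $b \in B$ and $\psi \in E$, cf.
(C.155), and define the desired extension


$$\tilde{\alpha} : C \to C^*(E,A) \tag{C.156}$$


by
$$\tilde{\alpha}(c)\varphi = \alpha(cb)\psi, \tag{C.157}$$


provided this is well defined (in which case $\tilde{\alpha}$ is clearly uniquely determined by $\alpha$).
Adjointability then also follows, since we may define $\tilde{\alpha}(c)^* = \tilde{\alpha}(c^*)$, and compute


$$\begin{aligned}
\langle\tilde{\alpha}(c)^*\alpha(b')\psi',\alpha(b)\psi\rangle_A &= \langle\alpha(c^*b')\psi',\alpha(b)\psi\rangle_A = \langle\psi',\alpha(c^*b')^*\alpha(b)\psi\rangle_A \\
&= \langle\psi',\alpha(b')^*\alpha(cb)\psi\rangle_A \\
&= \langle\alpha(b')\psi',\tilde{\alpha}(c)\alpha(b)\psi\rangle_A.
\end{aligned} \tag{C.158}$$


Furthermore, it is easy to see that $\tilde{\alpha}$ is a homomorphism. Also, $\tilde{\alpha}(c) = 0$ for $c \in C$
implies $\alpha(cb) = 0$ for each $b \in B$; if $\alpha$ is injective, then $cb = 0$, and if B is an
essential ideal in C, then $c = 0$, so that $\tilde{\alpha}$ is injective.

To show that (C.157) is independent of the representatives b and $\psi$, we estimate


$$\begin{aligned}
\|\tilde{\alpha}(c)\alpha(b)\psi\| &= \lim_\lambda \|\alpha(ce_\lambda b)\psi\| = \lim_\lambda \|\alpha(ce_\lambda)\alpha(b)\psi\| \\
&\leq \lim_\lambda \|\alpha(ce_\lambda)\|\,\|\alpha(b)\psi\| \leq \lim_\lambda \|ce_\lambda\|\,\|\alpha(b)\psi\| \\
&\leq \|c\|\,\|\alpha(b)\psi\|,
\end{aligned} \tag{C.159}$$


where $(e_\lambda)$ is an approximate unit in B. In particular, if $\alpha(b)\psi = \alpha(b')\psi'$, then
$\tilde{\alpha}(c)\alpha(b)\psi = \tilde{\alpha}(c)\alpha(b')\psi'$.


This proof works also without (C.155); one then has a finite sum $\varphi = \sum_i \alpha(b_i)\psi_i$,
and a computation similar to the previous one shows that $\tilde{\alpha}(c)$ is bounded on
the dense subspace of E consisting of such sums.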
This theorem (with $B \rightsquigarrow A$ and $E \rightsquigarrow A$) explains in which sense M(A) is a maximal
unitization of A (whereas $\dot{A}$ is a minimal one): all we need to do is abstractly
define a unitization of a non-unital C*-algebra A as a unital C*-algebra containing
A as an essential ideal (cf. Proposition C.71). This incorporates both $\dot{A}$ and M(A),
each being distinguished by a universal property it satisfies, namely:


Corollary C.73. For each unital C*-algebra C containing A as an essential ideal,
there are unique injective homomorphisms $C \to M(A)$ and $\dot{A} \to C$ whose restriction
to A is the identity map. In other words, denoting the inclusion of A into C by $\iota$, we
have commutative diagrams


[Two commutative triangles: $A \hookrightarrow \dot{A} \to C$, composing to $\iota$; and $A \xrightarrow{\iota} C \to M(A)$, composing to the inclusion $A \hookrightarrow M(A)$.]


The topological counterpart of this corollary is the construction of the one-point
compactification $\dot{X}$ and of the Čech–Stone compactification $\beta X$, respectively; cf.
Lemma C.38, which we may now supplement by simply defining $\beta X$ as the Gelfand
spectrum of the commutative C*-algebra $C_b(X) \cong C(\beta X)$. In this analogy, the condition
on an ideal $B \subset C$ to be essential simply corresponds to a non-compact space
X being a dense subspace of some compactification of it.


Corollary C.74. Let E be some Hilbert A-module and let $\alpha : B \to C^*(E,A)$ be an
injective nondegenerate homomorphism. The unique extension $\tilde{\alpha} : M(B) \to C^*(E,A)$
of $\alpha$ that exists according to Theorem C.72 maps M(B) isomorphically onto


$$Z_\alpha(E) = \{a \in C^*(E,A) \mid a\alpha(b) \in \alpha(B),\ \alpha(b)a \in \alpha(B)\ \forall b \in B\}. \tag{C.160}$$


Proof. Note that Z(E) is essential in C*(E,A), as easily follows from the nonde- 
generacy of a. Therefore, by the argument just given (plus the abstract nonsense 
that shows that universal objects are unique up to isomorphism), we only need to 
prove that $Z_\alpha(E)$ is a maximal unitization of B. Let B be an essential ideal in C
and consider the injective extension $\tilde{\alpha} : C \to C^*(E,A)$ of $\alpha$ given by Theorem C.72.
Then $\tilde{\alpha}$ maps C into $Z_\alpha(E)$ by construction, as $\tilde{\alpha}(c)\alpha(b) = \alpha(cb) \in \alpha(B)$, etc.


Corollary C.75. A nondegenerate homomorphism $\alpha : B \to M(A)$ has a unique extension
to a homomorphism $\tilde{\alpha} : M(B) \to M(A)$.


Proof. Take C = M(B) and E =A in Theorem C.72.1. 


Note that two nondegenerate homomorphisms $\alpha : A \to M(B)$ and $\beta : B \to M(C)$
can be composed into a nondegenerate homomorphism $\beta \circ \alpha : A \to M(C)$, which
by definition equals $\tilde{\beta} \circ \alpha$. Thus one obtains a category CAm whose objects are C*-algebras
and whose arrows are nondegenerate homomorphisms $\alpha : A \to M(B)$, with a
full subcategory CCAm whose objects are commutative C*-algebras (with the same
arrows). This leads to a neat extension of Gelfand duality (cf. Theorem C.45):
Theorem C.76. The category LCH of locally compact Hausdorff spaces and continuous
maps is dual to the category CCAm of commutative C*-algebras just defined.


This claim may be unfolded as in Theorem C.45, omitting ‘proper’ on the topological
side and replacing $\alpha : A \to B$ on the algebraic side by $\alpha : A \to M(B)$.


Proof. First, a continuous map $\varphi : Y \to X$ trivially induces a nondegenerate
homomorphism $\varphi^* : C_0(X) \to C_b(Y)$. Second, since $\omega \in \Sigma(B)$ defines a nondegenerate
homomorphism $B \to \mathbb{C}$, by Theorem C.72 it extends to a homomorphism
$\tilde{\omega} : M(B) \to \mathbb{C}$. Thus the pullback $\alpha^* : \Sigma(B) \to \Sigma(A)$ of a nondegenerate
homomorphism $\alpha : A \to M(B)$ is well defined (and still continuous). Part 3 of Theorem
C.45 stays the same, and the pertinent naturality properties are easily verified.


Corollary C.77. A nondegenerate homomorphism $\alpha : C_0(X) \to B(H)$ has a unique
extension to a homomorphism $\tilde{\alpha} : C_b(X) \to B(H)$.


Proof. Taking $A = \mathbb{C}$, $E = H$, and $B = B_0(H)$, Theorem C.72.2 gives


$$M(B_0(H)) \cong B(H). \tag{C.161}$$


Combine this with the previous corollary (with $B \rightsquigarrow C_0(X)$ and $A \rightsquigarrow B_0(H)$).


Finally, we show how to reconstruct A as a C*-algebra from A as a Hilbert A- 
module. The key to this is a more general construction: 


Definition C.78. The collection $C_0^*(E,A)$ of “compact” operators on a Hilbert A-module
E is the C*-algebra generated (within C*(E,A)) by all operators of the type
$|\varphi\rangle\langle\psi|$, where $\varphi,\psi \in E$, and
$$|\varphi\rangle\langle\psi|(\zeta) = \varphi\langle\psi,\zeta\rangle_A. \tag{C.162}$$
Such operators are easily seen to be adjointable, with adjoint
$$(|\varphi\rangle\langle\psi|)^* = |\psi\rangle\langle\varphi|, \tag{C.163}$$


and hence bounded, with norm majorized by $\|\psi\|\,\|\varphi\|$. If E = H is a Hilbert space,
then $C_0^*(H,\mathbb{C}) = B_0(H)$, since the maps $|\varphi\rangle\langle\psi|$ obviously generate the finite-rank
operators on H, whose norm-closure is $B_0(H)$, cf. Proposition B.131. Hence the
name “compact” operators, but in general elements of $C_0^*(E,A)$ need not be compact
(as operators on a Banach space) at all. The next and final example is a case in point:


Proposition C.79. If E = A as a Hilbert A-module in the usual way, then
$$C_0^*(A,A) \cong A. \tag{C.164}$$


Proof. We have $|a\rangle\langle b| = L_{ab^*}$, where $a \mapsto L_a$ is the canonical map from A to
$C^*(A,A) \subset B(A)$ given by $L_a(b) = ab$, see Proposition C.30. This map is isometric,
cf. (C.63), and hence injective. The map $|a\rangle\langle b| \mapsto ab^*$ from the linear span of
all operators (C.162) within $C_0^*(E,A)$ to A is therefore bounded, and has dense image
by Lemma C.47. Its unique continuous extension maps $C_0^*(E,A)$ onto A, see
Theorem C.62.5 (or use the Cohen–Hewitt factorization theorem to conclude).
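The identity $|a\rangle\langle b| = L_{ab^*}$ used at the start of this proof is a one-line computation from (C.162) and (C.135). It is sketched here for an arbitrary test element $c \in A$, a letter introduced only for this check.

```latex
\begin{align*}
  |a\rangle\langle b|\,(c) &= a\,\langle b, c\rangle_A && \text{by (C.162)}\\
                           &= a\,b^* c                 && \text{by (C.135)}\\
                           &= L_{ab^*}(c).
\end{align*}
% Consistency with (C.163), using that the adjoint of L_a is L_{a^*}:
\[
  \bigl(|a\rangle\langle b|\bigr)^* = (L_{ab^*})^* = L_{(ab^*)^*} = L_{ba^*} = |b\rangle\langle a| .
\]
```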
C.11 Gelfand topology as a frame


In the traditional approach to the Gelfand isomorphism, which we have followed
so far, the Gelfand spectrum $\Sigma(A)$ of a commutative unital C*-algebra A is first
constructed as a set, upon which it is equipped with a natural topology $\mathcal{O}(\Sigma(A))$,
i.e., the Gelfand topology. Alternatively, one may start with the latter and reconstruct
$\Sigma(A)$ as a set from it. This not only gives a better conceptual understanding of
Gelfand’s theory (relating it, for example, to a well-known construction in algebraic
geometry); it also has the technical advantage of making good sense in constructive
mathematics and hence in topos theory (which the classical theory does not).

In the language of lattice theory, the topology $\mathcal{O}(X)$ of any space X is an example
of a so-called frame (cf. Appendix D, compared to which we change notation so as
to avoid abuse of the ubiquitous symbol X), i.e., a complete lattice L in which


$$U \wedge \bigvee S = \bigvee\{U \wedge V,\ V \in S\}, \tag{C.165}$$


for arbitrary elements $U \in L$ and subsets $S \subseteq L$. This is sometimes written in the
form $U \wedge (\bigvee_\lambda V_\lambda) = \bigvee_\lambda (U \wedge V_\lambda)$, from which it is clear that the (binary) distributive
law $U \wedge (V \vee W) = (U \wedge V) \vee (U \wedge W)$, which of course is implied by (C.165), is
now required for arbitrary families. Indeed, the definition of a frame is primarily
motivated by the example $L = \mathcal{O}(X)$, in which it should be noted that the supremum


$$\bigvee S = \bigcup S \equiv \bigcup\{U \mid U \in S\}, \tag{C.166}$$


is simply given by the set-theoretic union of the elements of S, which are open sets
whose union is open by definition of a topology, whereas the infimum of arbitrary
families of open sets has to be doctored so as to make it open, and hence is given by


$$\bigwedge S = \bigvee\{U \in \mathcal{O}(X) \mid U \subseteq V\ \forall V \in S\}. \tag{C.167}$$


Frame maps, then, are defined as order-preserving maps between the underlying
posets that preserve finite infima and arbitrary joins. For example, if


$$\varphi : Y \to X \tag{C.168}$$
is a continuous map, then the inverse image map
$$\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y) \tag{C.169}$$


is a frame map. This also defines the category Frm of frames, whose opposite category
(that has the same objects but all arrows inverted) is called the category Loc
of locales. Thus a locale is a frame, seen as an object in the opposite category. If
no confusion arises (which, unfortunately, is rarely the case), elements of Frm are
written as $\mathcal{O}(X)$, even if they are not topologies (and indeed there are such frames,
see below), in which case the corresponding element of Loc is written as X.
In this spirit, frame maps are always written as (C.169), in which case the map in
the opposite direction between the corresponding locales is (C.168). This notation
suggests the right way of thinking, and we will use it whenever it is convenient.
Frames are very closely related to Heyting algebras, which were originally meant
to formalize the intuitionistic (propositional) logic of Brouwer, and are defined as
distributive lattices L (with top $\top$ and bottom $\bot$) equipped with a binary map


$$\to\ : L \times L \to L, \tag{C.170}$$
playing the role of implication in logic, that satisfies the axiom
$$U \leq (V \to W) \text{ iff } (U \wedge V) \leq W. \tag{C.171}$$


Every Boolean algebra is a Heyting algebra, but not vice versa; in fact, a Heyting
algebra is Boolean iff $\neg\neg U = U$ for all U, which is the case iff $(\neg U) \vee U = \top$ for all
U (which states the law of the excluded middle denied by Brouwer). In a Heyting
algebra (unlike a Boolean algebra), negation is a derived notion, defined by


$$\neg U = U \to \bot. \tag{C.172}$$


A Heyting algebra is complete when it is complete as a lattice, in that arbitrary
suprema (and hence also infima) exist. The infinite distributivity law (C.165) is
automatically satisfied in a complete Heyting algebra, which therefore is also a frame.
Conversely, a frame may be turned into a complete Heyting algebra by defining


$$V \to W = \bigvee\{U \mid U \wedge V \leq W\}. \tag{C.173}$$


Frames and complete Heyting algebras drift apart as soon as morphisms are concerned,
for although in both cases one requires maps to preserve the partial order,
maps between Heyting algebras must preserve $\to$ rather than infinite suprema.
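For a finite topology the frame axioms and the derived Heyting structure can be checked by brute force. The sketch below uses the Sierpinski space (a hypothetical choice of example, not taken from the text) and verifies (C.165) and (C.171), with implication defined by (C.173) and negation by (C.172). It also confirms that this frame is not Boolean. All names are illustrative.

```python
from itertools import combinations

# Hypothetical example space: the Sierpinski space X = {0, 1} with opens {}, {1}, X.
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), X]

def join(family):
    # Supremum in O(X): the set-theoretic union, cf. (C.166)
    out = frozenset()
    for u in family:
        out |= u
    return out

def meet(u, v):
    # Binary infimum in O(X): the intersection of two opens
    return u & v

def implies(v, w):
    # Heyting implication (C.173): join of all opens U with (U meet V) <= W
    return join(u for u in OPENS if meet(u, v) <= w)

def neg(u):
    # Negation (C.172): not-U is (U -> bottom)
    return implies(u, frozenset())

def subfamilies(opens):
    # All families S of opens (finitely many here)
    for r in range(len(opens) + 1):
        for fam in combinations(opens, r):
            yield fam

def is_frame():
    # Distributivity (C.165) for every open U and every family S
    return all(meet(u, join(s)) == join(meet(u, v) for v in s)
               for u in OPENS for s in subfamilies(OPENS))

def satisfies_adjunction():
    # Axiom (C.171): U <= (V -> W) iff (U meet V) <= W
    return all((u <= implies(v, w)) == (meet(u, v) <= w)
               for u in OPENS for v in OPENS for w in OPENS)

def is_boolean():
    # Excluded middle: (not-U) join U equals top for every U
    return all(join([neg(u), u]) == X for u in OPENS)
```

Here `neg(frozenset({1}))` is the empty set, so excluded middle fails for the open set {1}, and `is_boolean()` returns `False` while the other two checks succeed.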

The map $X \mapsto \mathcal{O}(X)$ from topological spaces to frames (which extends to a
contravariant functor in the obvious way, i.e., via (C.168) - (C.169)) is a competitor to
the map $X \mapsto C_0(X)$ from topological spaces to commutative C*-algebras, and one
goal of this section is to find out how these two constructions are related.

First, there is a frame-theoretic analogue of the categorical duality between lo- 
cally compact Hausdorff spaces and commutative C*-algebras (cf. Theorem C.45), 
in which locally compact Hausdorff spaces are replaced by so-called sober spaces 
(and no restrictions on continuous maps are made), whilst the category of frames 
must be restricted to so-called spatial frames (which move is somewhat analogous 
to restricting C*-algebras to commutative ones). We now explain these notions. 

A particularly simple frame is $2 = \{0,1\} \equiv \{\bot,\top\}$, with order $0 \leq 1$; this is just
the topology $\mathcal{O}(*)$ of a singleton $*$. In agreement with the above convention, a frame
map $p^{-1} : \mathcal{O}(X) \to 2$ will be written as a locale map $p : * \to X$. Such a map defines
a point of the locale X (i.e., of the frame $\mathcal{O}(X)$), and we denote the set of points of
X by Pt(X). To appreciate this definition, let us suppose that $\mathcal{O}(X)$ is the topology
of some space X. Each point $x \in X$ then corresponds to a genuine map
$$p_x : * \to X, \quad p_x(*) = x \tag{C.174}$$


whose inverse image map $p_x^{-1} : \mathcal{O}(X) \to 2$ is a frame map and hence defines a point
in the above sense. Conversely, if X is sober (see below), each point of $\mathcal{O}(X)$ arises
in that way. The set Pt(X) has a natural topology, with opens


$$\mathrm{Pt}(U) = \{p \in \mathrm{Pt}(X) \mid p(*) \in U\}, \tag{C.175}$$
where $U \in \mathcal{O}(X)$; here $p(*) \in U$ really means $p^{-1}(U) = 1$. This gives a frame map
$$U \mapsto \mathrm{Pt}(U) \tag{C.176}$$


from $\mathcal{O}(X)$ to $\mathcal{O}(\mathrm{Pt}(X))$. We say $\mathcal{O}(X)$ (or the locale X) is spatial if this map
is an isomorphism of frames. Roughly speaking, therefore, spatial frames are just
topologies (an example of a non-spatial frame is the lattice $\mathcal{O}_{\mathrm{reg}}(\mathbb{R})$ of regular open
subsets of $\mathbb{R}$, i.e., of open subsets U with the property $\neg\neg U = U$, where $\neg U$ is the
interior of the complement of U). This does not mean, however, that any topology
$\mathcal{O}(X)$ (seen as a frame) is isomorphic to $\mathcal{O}(\mathrm{Pt}(X))$, since Pt(X) may not be
homeomorphic to X.

Spaces X for which this is the case are called sober; more precisely, this means
that the map $x \mapsto p_x$ from X to Pt(X) considered above is a homeomorphism; less
precisely, we may say that sober spaces X may be reconstructed from their topology
$\mathcal{O}(X)$, up to homeomorphism. To give a more direct topological characterization
of sobriety, call $W \in \mathcal{O}(X)$ meet-irreducible if $U \cap V \subseteq W$ (where $U,V \in \mathcal{O}(X)$)
implies either $U \subseteq W$ or $V \subseteq W$. In any space X, all open sets of the form $W_x = X \setminus \overline{x}$
are meet-irreducible, where $x \in X$ (and $\overline{x}$ is the closure of $\{x\}$). A space X is sober,
then, iff these are the only such opens. For example, any Hausdorff space is sober
(an example of a non-sober space is $X = \mathbb{N}$ with the unusual topology in which all
complements of finite subsets are open, along with the empty set, of course).

The category Frm, then, has a full subcategory Spat of spatial frames, whilst 
likewise the category Top of topological spaces has a full subcategory Sob of sober 
spaces. We now have the following counterpart of Theorem C.45: 


Theorem C.80. The categories Spat and Sob are dual, in that:


1. If X is a sober space, then $\mathcal{O}(X)$ is a spatial frame. A continuous map $\varphi : Y \to X$
induces a frame map $\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y)$ in the natural way, such that if we
have another continuous map $\psi : Z \to Y$, then $(\varphi \circ \psi)^{-1} = \psi^{-1} \circ \varphi^{-1}$.

2. If $\mathcal{O}(X)$ is a spatial frame, then Pt(X) is a sober space. Furthermore, a locale
map $\varphi : Y \to X$ (i.e. a frame map $\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y)$) induces a continuous
function $\varphi_* : \mathrm{Pt}(Y) \to \mathrm{Pt}(X)$ by $\varphi_*(p) = \varphi \circ p$ (i.e. $\varphi_*(p^{-1}) = p^{-1} \circ \varphi^{-1}$), which
similarly behaves well under composition.

3. There are canonical homeomorphisms and frame maps:


$$p_X : X \xrightarrow{\cong} \mathrm{Pt}(\mathcal{O}(X)), \quad x \mapsto p_x; \tag{C.177}$$
$$\mathrm{Pt}_X : \mathcal{O}(X) \xrightarrow{\cong} \mathcal{O}(\mathrm{Pt}(\mathcal{O}(X))), \quad U \mapsto \mathrm{Pt}(U), \tag{C.178}$$
cf. (C.174) - (C.176), with the correct naturality properties (cf. Theorem C.23).
Proof. We will not give a complete proof of this, but the main points are that:


• Any locale defined by the topology of some space is spatial.

• Any space Pt(X) of points of some locale X (not necessarily a space) is sober.

• The map $p_X$ in (C.177) is a homeomorphism by definition of sobriety (which,
alternatively, could have been defined by requiring bijectivity of $p_X$, in which
case it can be shown that this map is continuous as well as open).

• By definition of the topology on Pt(X), the map (C.176) is surjective for any
locale X. If X is spatial, then for any distinct elements $U,V \in \mathcal{O}(X)$ there is a
point p such that $p^{-1}(U) \neq p^{-1}(V)$, but this is the same as saying that $\mathrm{Pt}(U) = \mathrm{Pt}(V)$
implies $U = V$. So in that case, (C.176) is also injective.


Our aim is to apply these ideas to Gelfand duality, specifically to an independent
description of the topology $\mathcal{O}(\Sigma(A))$ of the Gelfand spectrum $\Sigma(A)$ of a commutative
C*-algebra A. To put this in perspective, let A for the moment be a general
C*-algebra, and recall Definition C.55 of left, right and two-sided ideals (all taken
to be closed by definition). Further to these, there is another interesting notion.


Definition C.81. A hereditary subalgebra of a C*-algebra A is a C*-subalgebra B
of A with the property that $a \leq b$ for $b \in B^+$ and $a \in A^+$ implies $a \in B^+$. The set
of all hereditary subalgebras of A is denoted by H(A).


It is a simple exercise to show that there are bijective correspondences between
hereditary subalgebras B of A, left ideals L of A, and right ideals R of A, given by:


$$L = \{a \in A \mid a^*a \in B^+\}; \tag{C.179}$$
$$R = \{a \in A \mid aa^* \in B^+\}; \tag{C.180}$$
$$B = L \cap L^* = R \cap R^*. \tag{C.181}$$


Furthermore, one has $I(A) \subseteq H(A)$, where I(A) is the set of closed two-sided ideals
in A, and likewise we write L(A) and R(A). If A is commutative, these ideals are
two-sided, so that $L^* = L$ etc., and $L = R = B$, so that $H(A) = I(A) = L(A) = R(A)$.


Proposition C.82. The set H(A) is a complete lattice under inclusion as the partial
order, with inf and sup of any subset $S \subseteq H(A)$ given by


$$\bigwedge S = \bigcap S; \tag{C.182}$$
$$\bigvee S = \bigcap\{U \in H(A) \mid V \subseteq U\ \forall V \in S\}. \tag{C.183}$$


Moreover, if A is commutative, then $H(A) = I(A) = L(A) = R(A)$ is a frame.


Proof. The defining conditions on hereditary subalgebras of A are preserved by
arbitrary intersections, which means that H(A) has infima of arbitrary subsets, given
by (C.182). This implies that H(A) also has arbitrary suprema, given by (C.183),
which is a standard formula in lattice theory. Hence H(A) is a complete lattice.
The last claim follows from Corollary C.84 below (and the ensuing fact for topology).
It may also be proved directly, using the fact that $H(A) = I(A)$.
Proposition C.83. Let X be a locally compact Hausdorff space. Then the map


$$\mathcal{O}(X) \to H(C_0(X)); \tag{C.184}$$
$$U \mapsto C_0(U), \tag{C.185}$$


where $C_0(U)$ is seen as a subspace of $C_0(X)$, is a frame isomorphism, with inverse


$$H(C_0(X)) \to \mathcal{O}(X); \tag{C.186}$$
$$B \mapsto X \setminus F_B, \tag{C.187}$$


where, for any subset $B \subseteq C_0(X)$ one defines the (necessarily closed) set $F_B \subseteq X$ by
$$F_B = \{x \in X \mid f(x) = 0\ \forall f \in B\}. \tag{C.188}$$


Proof. For any open $U \in \mathcal{O}(X)$, we may regard $f \in C_0(U)$ as an element of $C_0(X)$
by extending f to all of X through $f|_{X\setminus U} = 0$. Continuity of f is only an issue at
boundary points of $U^c = X \setminus U$, so take $x_0 \in \partial U^c$ (i.e., any neighbourhood of $x_0$ has
nonempty intersection with both $U^c$ and U). Since $f(x_0) = 0$, to prove continuity of
f at $x_0$ we need to show that for any $\varepsilon > 0$, there is a neighbourhood N of $x_0$ such that
$|f(x)| < \varepsilon$ for each $x \in N$. Indeed, since $f \in C_0(U)$, there is a compact set $K \subseteq U$
such that $|f(x)| < \varepsilon$ for each $x \in U \setminus K$ (and hence also for each $x \in X \setminus K$). Then
$x_0 \notin K$ (since $x_0 \in U^c$), so we may take the open neighbourhood $N = X \setminus K$.

Since the ordering in $C_0(X)$ is pointwise, it is trivial that $C_0(U) \in H(C_0(X))$. The
map (C.185) also clearly preserves the order, i.e., if $U \subseteq V$, then $C_0(U) \subseteq C_0(V)$.

Half of the proof that (C.185) and (C.187) are mutually inverse is the equality


$$C_0(U) = C_0(X; X \setminus U), \tag{C.189}$$
where for any $F \subseteq X$ (usually taken to be closed), we define $C_0(X;F) \subseteq C_0(X)$ by
$$C_0(X;F) = \{f \in C_0(X) \mid f|_F = 0\}. \tag{C.190}$$


To prove (C.189), we just need to prove that $C_0(X; X \setminus U) \subseteq C_0(U)$, since the opposite
inclusion was proved at the beginning of this proof. Since $f \in C_0(X)$, for each
$\varepsilon > 0$ and each boundary point $x \in \partial U^c$, there is an open neighbourhood $N_x$ of x
where $|f| < \varepsilon$, as well as a compact set $K \subseteq X$ outside which the same is true. Then
$V = \bigcup_{x \in \partial U^c} N_x \cap U$ is open in U, so that its complement $U \setminus V$ is closed in U, and
$K' = (U \setminus V) \cap K$ is compact in U. Clearly, $|f| < \varepsilon$ outside $K'$, whence $f \in C_0(U)$.
Having proved (C.189), the other half of the proof of bijectivity of (C.184) is


$$B = C_0(X; F_B), \tag{C.191}$$


for any $B \in H(C_0(X))$. The inclusion $B \subseteq C_0(X; F_B)$ is trivial. For the converse, we
exploit the fact that B is an ideal in $C_0(X)$, so that $C_0(X)/B$ is a C*-algebra by
Theorem C.60. Let $\tau : C_0(X) \to C_0(X)/B$ be the canonical projection. If $f \notin B$, then
$\tau(f) \neq 0$. Hence there is a character $\omega' \in \Sigma(C_0(X)/B)$ such that $\omega'(\tau(f)) \neq 0$.
Lift $\omega'$ to $\omega = \omega' \circ \tau \in \Sigma(C_0(X)) \cong X$, so that there is $x \in X$ such that $\omega(g) = g(x)$
for all $g \in C_0(X)$. Since $\tau(g) = 0$ for each $g \in B$, we have $\omega(g) = 0$, and hence
$g(x) = 0$ for each $g \in B$, so that $x \in F_B$. But $f(x) \neq 0$, so $f \notin C_0(X; F_B)$, and hence
we have proved the inclusion $C_0(X; F_B) \subseteq B$.


Thus Proposition C.83 could just as well have been formulated in terms of closed sets,
albeit at the cost of inverting the partial order. Also, note the isomorphism


$$C_0(X)/C_0(U) \cong C_0(X \setminus U), \quad [f] \mapsto f|_{X \setminus U}. \tag{C.192}$$
Corollary C.84. For any commutative C*-algebra A, there is a frame isomorphism
$$\mathcal{O}(\Sigma(A)) \cong H(A). \tag{C.193}$$


This sheds new light on maximal ideals in A as points of the Gelfand spectrum $\Sigma(A)$,
cf. Proposition C.13. We need a lemma that applies to any frame $\mathcal{O}(X)$. A prime
element $P \in \mathcal{O}(X)$ is an element $P \neq \top$ such that $U \wedge V \leq P$ iff $U \leq P$ or $V \leq P$.
For a point $p^{-1} : \mathcal{O}(X) \to 2$, we write $\ker(p^{-1})$ for $\{U \in \mathcal{O}(X) \mid p^{-1}(U) = 0\}$.


Lemma C.85. For any frame $\mathcal{O}(X)$ (i.e. locale X), there is a bijective correspondence
between points $p^{-1} : \mathcal{O}(X) \to 2$ of X and prime elements $P \in \mathcal{O}(X)$, viz.


$$P = \bigvee \ker(p^{-1}); \tag{C.194}$$
$$p^{-1}(U) = 0 \text{ iff } U \leq P. \tag{C.195}$$


Under this correspondence, the topology on Pt(X) is given by the Zariski topology,
whose closed sets $F_P$ consist of all $Q \geq P$, where P is some prime element of $\mathcal{O}(X)$.


Proof. The requirement that $p^{-1}$ be a frame map implies the following properties
of its kernel $K = \ker(p^{-1})$: $\top \notin K$, $U \wedge V \in K$ iff $U \in K$ or $V \in K$, and $\bigvee S \in K$ iff
each $V \in S$ is in K. Any subset $K \subseteq \mathcal{O}(X)$ satisfying these properties in turn defines
a point p of X whose kernel is K. Then $P = \bigvee K$ is a prime element of $\mathcal{O}(X)$, and
conversely, K (and hence p) may be recovered from P as its downset $K = {\downarrow}P$.
The given topology on the set of prime elements is a rewriting of (C.175).
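The correspondence of Lemma C.85 can be made concrete for a finite frame by enumerating all frame maps into 2. The sketch below again uses the Sierpinski space as a hypothetical example; on a finite frame, preserving top, bottom, binary meets and binary joins is enough for a map to be a frame map. The function names are illustrative only.

```python
from itertools import product

# Hypothetical example: the Sierpinski space, whose frame of opens is a three-element chain.
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), X]

def is_frame_map(p):
    # p is a dict from opens to {0, 1}; check top, bottom, binary meets and joins.
    if p[X] != 1 or p[frozenset()] != 0:
        return False
    return all(p[u & v] == min(p[u], p[v]) and p[u | v] == max(p[u], p[v])
               for u in OPENS for v in OPENS)

def points():
    # All frame maps O(X) -> 2, i.e. all points of the locale X
    candidates = (dict(zip(OPENS, values))
                  for values in product((0, 1), repeat=len(OPENS)))
    return [p for p in candidates if is_frame_map(p)]

def prime_of(p):
    # (C.194): P is the join (here: union) of the kernel of p
    out = frozenset()
    for u in OPENS:
        if p[u] == 0:
            out |= u
    return out

def is_prime(P):
    # P is not the top element, and (U meet V) <= P iff U <= P or V <= P
    return P != X and all(((u & v) <= P) == (u <= P or v <= P)
                          for u in OPENS for v in OPENS)

def primes():
    return [P for P in OPENS if is_prime(P)]
```

For this space `points()` yields two frame maps, whose associated elements `prime_of(p)` are the empty set and {1}; these coincide with `primes()`, and each point satisfies (C.195).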


The prime elements of H(A), where A is a commutative C*-algebra, are the prime
ideals in A, i.e., the proper ideals $J \subset A$ such that $J_1J_2 \subseteq J$ iff $J_1 \subseteq J$ or $J_2 \subseteq J$, for
any ideals $J_1, J_2$ of A (closed by definition, like J); note that $J_1J_2 = J_1 \cap J_2$.


Theorem C.86. 1. The frame H(A) of hereditary subalgebras of a commutative C*- 
algebra A is spatial, with Pt(H(A)) = (A) as topological spaces. 
2. The prime elements of H(A) are the maximal ideals of A, so that, equipping the 


set M(A) of maximal ideals of A with the Zariski topology, also M(A) = Z(A). 


Proof. 1. Proposition C.83 bijectively relates prime elements in H(A) to meet-irreducible
sets in $\Sigma(A)$. The description of sobriety in terms of meet-irreducibility
after (C.176), which applies because $\Sigma(A)$ is locally compact Hausdorff and
hence sober, then bijectively relates these meet-irreducible sets to points of $\Sigma(A)$.

2. Proposition C.13 in turn relates points of $\Sigma(A)$ to maximal ideals of A.

Having understood the structure of commutative C*-algebras, we now turn to the
general case. We already know that the algebra B(H) of all bounded operators on
some Hilbert space H is a C*-algebra in the obvious way (i.e., the algebraic operations
are the natural ones, the involution is the operator adjoint $a \mapsto a^*$, and
the norm is the operator norm of Banach space theory). Moreover, each (operator)
norm-closed *-algebra in B(H) is a C*-algebra. Our goal is to prove the converse:


Theorem C.87. Each C*-algebra A is isomorphic to a norm-closed *-algebra in
B(H), for some Hilbert space H. Equivalently, for any C*-algebra A there exist a
Hilbert space H and an injective homomorphism $\pi : A \to B(H)$.


A homomorphism $\pi : A \to B(H)$ is called a representation of A on H. The equivalence
between the two statements in the theorem follows from Theorem C.62.

Let us note that Theorems C.8 and C.87 harmonize as follows: any measure $\mu$
on X satisfying $\mu(U) > 0$ for each nonempty open $U \subset X$ leads to an injective representation
of $C_0(X)$ on $L^2(X, \mu)$ by multiplication operators, that is, $\pi(f) = m_f$, cf. (B.238).

The proof of Theorem C.87 uses the elegant GNS-construction, named after
Gelfand, Naimark, and Segal, which is important in its own right. We initially assume
that A is unital. First, we call a representation $\pi$ cyclic if its carrier space H
contains a cyclic vector $\Omega$ for $\pi$, i.e., the closure of $\pi(A)\Omega$ coincides with H.


Theorem C.88. Let $\omega$ be a state on a C*-algebra A. There exists a cyclic representation
$\pi_\omega$ of A on a Hilbert space $H_\omega$ with cyclic unit vector $\Omega_\omega$ such that


$\omega(a) = \langle \Omega_\omega, \pi_\omega(a)\Omega_\omega \rangle, \quad a \in A.$ (C.196)


Proof. We first give the proof in the special case that A has a unit $1_A$, and $\omega(a^*a) > 0$
for all $a \neq 0$. Define a sesquilinear form $\langle -, - \rangle$ on A by


$\langle a, b \rangle = \omega(a^*b).$ (C.197)
This form is positive definite by definition of a state, so that we may complete A in
the ensuing norm
$\|a\|_\omega = \sqrt{\omega(a^*a)},$ (C.198)
to a Hilbert space called $H_\omega$. For each $a \in A$, we then define a map


$\pi_\omega(a) : A \to A;$ (C.199)
$\pi_\omega(a)b = ab.$ (C.200)


Regarding A as a dense subspace of $H_\omega$, this defines an operator $\pi_\omega(a)$ on a dense
domain in $H_\omega$. This operator is bounded, since (C.94) implies


$\|\pi_\omega(a)\| \leq \|a\|.$ (C.201)

Hence $\pi_\omega(a)$ may be extended from A to $H_\omega$ by continuity, and we obtain a map
$\pi_\omega : A \to B(H_\omega)$. Simple computations show that $\pi_\omega$ is a representation. The special
vector $\Omega_\omega$ is the unit $1_A \in A$, seen as an element of $H_\omega$: its cyclicity is obvious, and:


$\|\Omega_\omega\|^2 = \langle \Omega_\omega, \Omega_\omega \rangle = \omega(1_A^* 1_A) = \omega(1_A) = 1;$ (C.202)
$\langle \Omega_\omega, \pi_\omega(a)\Omega_\omega \rangle = \omega(1_A a 1_A) = \omega(a).$ (C.203)


Under our standing assumption $\omega(a^*a) > 0$ if $a \neq 0$, this not only proves Theorem
C.88, but also Theorem C.87: for $\pi_\omega(a) = 0$ implies $\|\pi_\omega(a)\Omega_\omega\|^2 = 0$, whose left-hand
side is precisely $\langle \Omega_\omega, \pi_\omega(a^*a)\Omega_\omega \rangle = \omega(a^*a)$, so that $a = 0$. Thus $\pi_\omega$ is faithful.
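
The "simple computations" behind the claim that $\pi_\omega$ is a representation can be spelled out on the dense subspace A of $H_\omega$. The following is only a sketch, using nothing beyond (C.197) and (C.200); the auxiliary element $c \in A$ is introduced here for illustration.

```latex
\begin{align*}
\pi_\omega(a)\pi_\omega(b)c &= a(bc) = (ab)c = \pi_\omega(ab)c, \\
\langle b, \pi_\omega(a)c \rangle &= \omega(b^*ac) = \omega((a^*b)^*c)
  = \langle \pi_\omega(a^*)b, c \rangle .
\end{align*}
```

The first line gives multiplicativity and the second gives $\pi_\omega(a)^* = \pi_\omega(a^*)$; linearity in a is immediate from (C.200), and both identities extend from A to $H_\omega$ by continuity.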

In general, a C*-algebra may lack such states, and we must adapt the proof of
both theorems. The GNS-construction is easy: for an arbitrary state $\omega$, we introduce


$N_\omega = \{a \in A \mid \omega(a^*a) = 0\}.$ (C.204)
If $a_\omega$ is the image of $a \in A$ in $A/N_\omega$, we may define an inner product on the latter by
$\langle a_\omega, b_\omega \rangle = \omega(a^*b);$ (C.205)


this is well defined and positive definite, and we define the Hilbert space $H_\omega$ as the
completion of $A/N_\omega$ in this inner product. Furthermore, we define


$\pi_\omega(a) : A/N_\omega \to H_\omega;$ (C.206)
$\pi_\omega(a)b_\omega = (ab)_\omega;$ (C.207)


this is well defined, because $N_\omega$ is a left ideal in A by (C.94). Finally, we define
$\Omega_\omega = (1_A)_\omega.$ (C.208)


The proof that everything works is then a simple exercise. Another way to look at
the cyclic vector $\Omega_\omega$ is to let $\omega$ define a linear functional $\tilde{\omega} : A/N_\omega \to \mathbb{C}$ by


$\tilde{\omega}(a_\omega) = \omega(a);$ (C.209)


this functional is continuous on $A/N_\omega \subset H_\omega$, because $|\omega(a)|^2 \leq \omega(a^*a) = \|a_\omega\|^2$,
as follows from the Cauchy-Schwarz inequality for the positive semidefinite form
(C.197). Hence by Riesz-Fréchet there is an implementing vector $\Omega_\omega$ such that


$\omega(a) = \langle \Omega_\omega, a_\omega \rangle.$ (C.210)


Finally, when A has no unit, in defining $\Omega_\omega$ we either use the GNS-construction for
the unitization $\dot{A}$ and restrict $\pi_\omega(\dot{A})$ to A to define $\pi_\omega(A)$, or use (C.210).
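
To make the quotient $A/N_\omega$ concrete, the following sketch works out (C.204)-(C.208) for the illustrative choice A = M_2(C) with the pure state $\omega(a) = a_{00}$. This choice, and all helper names, are assumptions made here and do not come from the text. Since $\omega(a^*a)$ is the squared length of the first column of a, the class $a_\omega$ may be identified with that column, so that $H_\omega$ is C^2.

```python
# Pure-Python sketch of the GNS data (C.204)-(C.208) for A = M_2(C) and the
# vector state omega(a) = a[0][0]. All helper names are illustrative.

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def omega(a):
    return a[0][0]

def gns_class(a):
    # The class of a in A/N_omega is determined by the first column of a,
    # because omega(a*a) = |a[0][0]|^2 + |a[1][0]|^2.
    return [a[0][0], a[1][0]]

def inner(x, y):
    # Inner product on C^2, antilinear in the first entry, matching (C.205).
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def pi_omega(a, vec):
    # pi_omega(a) b_omega = (ab)_omega: a acts on the first column of b.
    return [sum(a[i][k] * vec[k] for k in range(2)) for i in range(2)]

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
OMEGA_VEC = gns_class(IDENTITY)  # the cyclic vector of (C.208)

a = [[1 + 2j, 3 + 0j], [0 - 1j, 4 + 0j]]
b = [[2 + 0j, 1 + 1j], [5 + 0j, 0 - 3j]]
```

In this example $\pi_\omega$ is just the defining representation of M_2(C) on C^2, which is irreducible, in line with the purity of $\omega$.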


One of the nicest features of the GNS-construction is the link between purity of
the state $\omega$ and irreducibility of the corresponding representation $\pi_\omega$.

Definition C.89. We call a representation $\pi$ of a C*-algebra A on a Hilbert space
H irreducible if the only closed subspaces K of H that are stable under $\pi(A)$ (in
the sense that if $\psi \in K$, then $\pi(a)\psi \in K$ for all $a \in A$) are either K = H or K = {0}.


Theorem C.90. Each of the following conditions is equivalent to irreducibility:


1. $\pi(A)' = \mathbb{C} \cdot 1$, where $S'$ is the commutant of $S \subset B(H)$ (Schur's Lemma);
2. $\pi(A)'' = B(H)$;
3. Every nonzero vector in H is cyclic for $\pi(A)$.


Furthermore, if $\omega$ is a state on A, then $\omega$ is pure iff the corresponding GNS-representation
$\pi_\omega$ is irreducible.


Proof. If $\pi(A)' \neq \mathbb{C} \cdot 1$, then $\pi(A)'$ must contain a nontrivial self-adjoint element
a (as it is a *-algebra), and hence also a nontrivial projection e (as the spectral
projections $e_\Delta = 1_\Delta(a)$ of a, defined as in Theorem B.102, lie in $\pi(A)'$, too). But if
$e \in \pi(A)'$, then eH is stable under $\pi(A)$, and hence $\pi$ cannot be irreducible. Thus
irreducibility implies 1. Conversely, if $\pi(A)' = \mathbb{C} \cdot 1$, then $\pi$ must be irreducible by
the same argument, since if not, the projection onto some proper stable subspace
K for $\pi$ would be a nontrivial element of $\pi(A)'$. The equivalence 1 ⇔ 2 is clear,
since $(\mathbb{C} \cdot 1)' = B(H)$. Similarly, if some nonzero $\varphi \in H$ failed to be cyclic for $\pi$, then the
closure of $\pi(A)\varphi$ would be a proper, $\pi(A)$-stable subspace of H, so that irreducibility implies 3. The
converse is trivial, since if some proper nonzero $K \subset H$ were stable for $\pi(A)$, then 3 cannot hold.
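
The role of the commutant can be checked by hand in the smallest nontrivial case. The sketch below compares two illustrative algebras acting on C^2, the diagonal matrices D_2 and the full matrix algebra M_2(C); these examples and all helper names are assumptions made here and do not come from the text.

```python
# Sketch: a nontrivial projection lies in the commutant of the (reducible)
# diagonal algebra D_2, but not in the commutant of the (irreducible) M_2(C).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unit(i, j, n=2):
    # Matrix unit E_ij: entry 1 at position (i, j), zero elsewhere.
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def commutes_with_all(e, generators):
    return all(matmul(e, g) == matmul(g, e) for g in generators)

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

diagonal_algebra = [unit(0, 0), unit(1, 1)]                       # spans D_2
full_algebra = [unit(i, j) for i in range(2) for j in range(2)]   # spans M_2(C)
projection = unit(0, 0)   # nontrivial projection onto C e_0
e0 = [1, 0]
```

For D_2 the vector e_0 is not cyclic, since every diagonal matrix maps it into C e_0, whereas the matrix units of M_2(C) already map e_0 onto both basis vectors.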


Another useful result relates general representations to GNS-representations. We
call two representations $\pi_i : A \to B(H_i)$, i = 1, 2, unitarily equivalent if there is a
unitary $u : H_1 \to H_2$ such that $u\pi_1(a)u^* = \pi_2(a)$ (or $u\pi_1(a) = \pi_2(a)u$) for each $a \in A$.


Proposition C.91. Let $\pi : A \to B(H)$ be a cyclic representation of A on H. If $\psi \in H$ is a
cyclic unit vector for $\pi$, then


$\omega(a) = \langle \psi, \pi(a)\psi \rangle$ (C.211)
is a state on A, whose GNS-representation $\pi_\omega$ is unitarily equivalent to $\pi$.


Proof. Define $u : H_\omega \to H$ first on $\pi_\omega(A)\Omega_\omega$ (which is a dense subspace of $H_\omega$) by
$u\pi_\omega(a)\Omega_\omega = \pi(a)\psi.$ (C.212)
Using (C.211) and (C.196), we then obtain
$\|\pi_\omega(a)\Omega_\omega\|^2 = \omega(a^*a) = \langle \psi, \pi(a^*a)\psi \rangle = \|\pi(a)\psi\|^2.$ (C.213)


This shows that u is well defined as well as isometric, so that it extends to $H_\omega$ by
continuity. Its image is then the closure of $\pi(A)\psi$, which is H, since $\psi$ is cyclic by
assumption. Thus u is surjective and hence unitary. Finally, we compute


$u\pi_\omega(a)\pi_\omega(b)\Omega_\omega = \pi(a)\pi(b)\psi = \pi(a)u\pi_\omega(b)\Omega_\omega,$ (C.214)


so that $u\pi_\omega(a) = \pi(a)u$ on the dense space $\pi_\omega(A)\Omega_\omega$, and thence everywhere.

We now take up the proof of Theorem C.87, preceded by some general remarks
on direct sums of Hilbert spaces and representations. First, if $(H_1, \ldots, H_n)$ is a finite
family of Hilbert spaces, one may form the direct sum $H = H_1 \oplus \cdots \oplus H_n$, initially
merely as a vector space, and subsequently also as a space with inner product


$\langle (\varphi_1, \ldots, \varphi_n), (\psi_1, \ldots, \psi_n) \rangle = \sum_{i=1}^{n} \langle \varphi_i, \psi_i \rangle.$ (C.215)


It is easy to see that H is complete in the ensuing norm


$\|(\psi_1, \ldots, \psi_n)\|^2 = \sum_{i=1}^{n} \|\psi_i\|^2.$ (C.216)


Some authors write $\psi_1 \oplus \cdots \oplus \psi_n$, $\psi_1 + \cdots + \psi_n$, or $\psi_1 \dot{+} \cdots \dot{+} \psi_n$ for $(\psi_1, \ldots, \psi_n)$.
Moreover, if $(\pi_i)$ is a family of representations $\pi_i : A \to B(H_i)$, then one obtains
a new representation $\oplus_i \pi_i$ of A, called the direct sum of the $\pi_i$, by


$\oplus_i \pi_i(a)(\psi_1, \ldots, \psi_n) = (\pi_1(a)\psi_1, \ldots, \pi_n(a)\psi_n).$ (C.217)


This construction works for arbitrary families of Hilbert spaces $(H_x)$ and representations
$(\pi_x)$, where $x \in X$ for some index set X. First, the elements of $H = \oplus_x H_x$
are families $(\psi) \equiv (\psi_x)_{x \in X}$, where $\psi_x \in H_x$, such that


$\|(\psi)\|^2 = \sup_{F \subset X} \sum_{x \in F} \|\psi_x\|^2_{H_x} < \infty,$ (C.218)


where the supremum is over all finite subsets F of X, so that the sum is defined as in
(B.11). In that case, the obvious linear operations (i.e., $((\psi) + (\varphi))_x = \psi_x + \varphi_x$ and
$(\lambda(\psi))_x = \lambda \cdot \psi_x$) are defined within H, since for each pair $(\varphi), (\psi) \in H$ we have,
from the triangle inequality for the norm in each finite direct sum $H_F = \oplus_{x \in F} H_x$,


$\left( \sum_{x \in F} \|\psi_x + \varphi_x\|^2 \right)^{1/2} \leq \left( \sum_{x \in F} \|\psi_x\|^2 \right)^{1/2} + \left( \sum_{x \in F} \|\varphi_x\|^2 \right)^{1/2} \leq \|(\psi)\| + \|(\varphi)\|.$


The supremum over F gives $\|(\psi) + (\varphi)\|$, which is therefore finite and satisfies the
triangle inequality for the norm. Similarly, the natural inner product in H is well
defined, this time by the full Definition B.6, with $V = \mathbb{C}$ and $f(x) = \langle \varphi_x, \psi_x \rangle_{H_x}$, i.e.,


$\langle (\varphi), (\psi) \rangle = \sum_{x \in X} \langle \varphi_x, \psi_x \rangle_{H_x}.$ (C.219)

To see this, we apply Cauchy-Schwarz first in each $H_x$ and then in $\ell^2(X)$ to obtain


$|\langle (\varphi), (\psi) \rangle| \leq \sum_{x \in X} |\langle \varphi_x, \psi_x \rangle| \leq \sum_{x \in X} \|\varphi_x\| \|\psi_x\| \leq \|(\varphi)\| \|(\psi)\| < \infty.$ (C.220)


Finally, the proof that the direct sum Hilbert space $\oplus_x H_x$ is complete in the norm
(C.218) is similar to the case where $H_x = \mathbb{C}$ for each x, i.e., $H = \ell^2(X)$, cf. Theorem
B.9. Let $(\psi)_n$ be a Cauchy sequence in H, consisting of sequences $(\psi_x)_n \equiv \psi_x^{(n)}$ in
each $H_x$. For each finite $F \subset X$ and $\varepsilon > 0$, we must have $\sum_{x \in F} \|\psi_x^{(n)} - \psi_x^{(m)}\|^2 < \varepsilon$ for
sufficiently large n, m, so that each $(\psi_x)_n$ must be Cauchy in $H_x$, with limit $\psi_x$. The
ensuing set $(\psi)$ of vectors lies in H by the argument following (B.19), and the given
Cauchy sequence $(\psi)_n$ converges to $(\psi)$, again by the same proof as for $\ell^2(X)$.

If one has a family $(\pi_x)$ of representations $\pi_x : A \to B(H_x)$, their direct sum
$\pi = \oplus_x \pi_x$, defined by $(\pi(a)(\psi))_x = \pi_x(a)\psi_x$, is a representation of A on H. Indeed, one
has $\|\pi(a)\| = \sup_x \{\|\pi_x(a)\|\}$, and since we have $\|\pi_x(a)\| \leq \|a\|$ for each x, we also
have $\|\pi(a)\| \leq \|a\|$, so that $\pi(a) \in B(H)$, and hence $\pi$ maps A into B(H).

Our first use of such direct sums shows that cyclic representations are the building
blocks of any representation $\pi$, at least if we require $\pi$ to be nondegenerate in the
sense that, for $\psi \in H$, having $\pi(a)\psi = 0$ for all $a \in A$ implies $\psi = 0$.


Proposition C.92. Any nondegenerate representation $\pi : A \to B(H)$ of a C*-algebra
A on a Hilbert space H is a direct sum of cyclic representations of A.


Proof. Consider families $(\psi_x)_{x \in X}$ of nonzero vectors in H with the property that
$\langle \pi(a)\psi_x, \pi(b)\psi_{x'} \rangle = 0,$ (C.221)


for all $a, b \in A$ and all $x \neq x'$. Such families are partially ordered by inclusion, and
an easy application of Zorn's Lemma shows that there is a maximal such family.
For this family $(\psi_x)_{x \in X}$, we define $H_x$ as the closure of $\pi(A)\psi_x$ in H. Since $\pi$ is a
homomorphism, each $H_x$ is stable under $\pi(A)$, and hence the restriction $\pi_x(a)$ of $\pi(a)$
to $H_x$ defines a representation of A, which is cyclic by construction. By maximality of the
family and nondegeneracy of $\pi$, it follows that
$H = \oplus_x H_x$ and $\pi = \oplus_x \pi_x$, and so the claim has been proved.


Our second use is the proof of Theorem C.87, where we have to solve the problem
of the possible lack of injectivity of $\pi_\omega$ in our previous preliminary proof.


Proof. To do so, we replace $H_\omega$ by the crazy Hilbert space $H_c = \oplus_{\omega \in P(A)} H_\omega$, where
P(A) is the pure state space of A. The Hilbert space $H_c$ carries a representation
$\pi_c = \oplus_{\omega \in P(A)} \pi_\omega$. The point is that if $\pi_c(a) = 0$, then $\pi_\omega(a^*a)\Omega_\omega = 0$ for each $\omega \in P(A)$,
which by (C.196) implies $\omega(a^*a) = 0$. Proposition C.15 then gives $\sigma(a^*a) = \{0\}$,
from which the spectral radius formula (C.55) gives $\|a\| = 0$, and hence a = 0. It
follows that $\pi_c$ is injective, and Theorem C.87 is proved.
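
The mechanism of this proof is easy to see in a toy case. The sketch below takes A = C(X) for a three-point set X, which is an assumption chosen here for illustration, as are all helper names. Each pure state is an evaluation with a one-dimensional GNS space, so a single GNS-representation may kill a nonzero element, whereas the direct sum over all pure states is injective and even isometric.

```python
# Sketch for A = C(X) with X finite: each pure state ev_x has a one-dimensional
# GNS space, and the direct sum over all x represents f as a diagonal matrix.

def pi_single(f, x):
    # GNS-representation of the pure state ev_x: multiplication by f(x) on C.
    return f[x]

def pi_direct_sum(f):
    # Diagonal matrix diag(f(x_1), ..., f(x_n)) on the direct sum of copies of C.
    n = len(f)
    return [[f[i] if i == j else 0 for j in range(n)] for i in range(n)]

def operator_norm_diagonal(m):
    # For a diagonal matrix the operator norm is the largest modulus on the diagonal.
    return max(abs(m[i][i]) for i in range(len(m)))

def sup_norm(f):
    return max(abs(v) for v in f)

f = [0, 2 - 1j, 0]   # a nonzero function on X = {0, 1, 2}
```

Here `pi_single(f, 0)` vanishes although f is nonzero, while the norm of `pi_direct_sum(f)` equals the supremum norm of f.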


It should be noted that this proof relies on a shock-and-awe kind of overkill (though
nothing compared to the even crazier space $H_{cc} = \oplus_{\omega \in S(A)} H_\omega$, which is traditionally
used in the above proof), in that $H_c$ is far larger than necessary (indeed, in all
but the most trivial cases, $H_c$ is non-separable). For example, already for $A = M_2(\mathbb{C})$
we have $P(A) \cong S^2$, so that $H_c = \oplus_{\omega \in S^2} \mathbb{C}^2$; this Hilbert space is non-separable,
whereas A has an injective representation on $\mathbb{C}^2$. More generally, $B_0(H)$ or B(H)
has an injective representation on H by definition, whereas $H_c$ is non-separable. In
the commutative case, $A = C_0(X)$ yields $H_c = \oplus_{x \in X} \mathbb{C}$, which is non-separable for uncountable X, although
A has an injective representation on the (typically) separable space $L^2(X, \mu)$.

detail (cf. §1.5 for the simple case where X is finite). If $\omega$ is some state on $C_0(X)$,
then by Theorem B.24, there is a unique probability measure $\mu$ on X such that


$\omega(f) = \int_X d\mu\, f, \quad f \in C_0(X),$ (C.222)


cf. (B.39). It follows from (C.204) and (C.222) that


$N_\omega = \left\{ f \in C_0(X) \mid \int_X d\mu\, |f|^2 = 0 \right\}.$ (C.223)


In particular, the support of $\mu$ is X iff $N_\omega = \{0\}$, in which case $A/N_\omega = C_0(X)$.
In the opposite case where $\omega$ is a pure state, i.e., $\omega = \omega_x$ for some $x \in X$, with
$\omega_x(f) = f(x)$, one has $N_\omega = \{f \in C_0(X) \mid f(x) = 0\}$, so that $A/N_\omega \cong \mathbb{C}$, under the
map $[f] \mapsto f(x)$. In general, from (C.206) - (C.207) we obtain


$H_\omega = L^2(X, \mu);$ (C.224)
$\pi_\omega(f) = m_f;$ (C.225)
$\Omega_\omega = 1_X,$ (C.226)


where $m_f \psi = f\psi$, cf. (B.238). Analogously to (B.331), we then obtain
$\pi_\omega(C_0(X))'' = L^\infty(X, \mu).$ (C.227)


The state $\omega$, initially defined on the commutative C*-algebra $C_0(X)$, then has a
normal extension to the commutative von Neumann algebra $L^\infty(X, \mu)$, cf. (C.222).
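
For a finite sample space the measure-theoretic statements above reduce to weighted sums, which makes the role of the support of $\mu$ visible. The following is a sketch under that assumption; the three-point space, the particular weights, and all helper names are hypothetical choices made here.

```python
# Sketch for finite X = {0, ..., n-1}: a state on C(X) is a probability vector mu.

def omega(mu, f):
    # (C.222) for a finite sample space: integration becomes a weighted sum.
    return sum(m * v for m, v in zip(mu, f))

def in_null_space(mu, f):
    # (C.223): f lies in N_omega iff the integral of |f|^2 vanishes.
    return omega(mu, [abs(v) ** 2 for v in f]) == 0

def support(mu):
    return [x for x, m in enumerate(mu) if m > 0]

def gns_inner(mu, f, g):
    # Inner product of L^2(X, mu), antilinear in the first entry.
    return sum(m * complex(a).conjugate() * b for m, a, b in zip(mu, f, g))

mu = [0.5, 0.0, 0.5]    # the point 1 lies outside the support of mu
f = [0, 7, 0]           # vanishes on the support of mu
g = [1, 7, 1]           # agrees with the constant function 1 on the support
one = [1, 1, 1]
```

In this example f lies in $N_\omega$, the functions g and 1 define the same vector of $L^2(X, \mu)$, and the GNS space has dimension two, the size of the support.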

More generally, if A is an arbitrary commutative C*-algebra and $\omega$ is a state on
A, then, writing $\Sigma(A)$ for the Gelfand spectrum of A as usual, we have


$H_\omega = L^2(\Sigma(A), \mu);$ (C.228)
$\pi_\omega(f) = m_{\hat{f}};$ (C.229)
$\Omega_\omega = 1_{\Sigma(A)},$ (C.230)


where $\hat{f} \in C_0(\Sigma(A))$ is the Gelfand transform of $f \in A$, and $\mu$ is the probability
measure on $\Sigma(A)$ defined by


$\omega(f) = \int_{\Sigma(A)} d\mu\, \hat{f}.$ (C.231)


With this commutative case in mind, some authors would call a pair $(A, \omega)$, where
A is a general C*-algebra and $\omega$ is a state on A, or, alternatively, A is a general
von Neumann algebra and $\omega$ is a normal state on A, a non-commutative probability
space. As such, "ordinary" probability theory (at least, on locally compact
Hausdorff sample spaces) is merely the commutative case of a much more general
"non-commutative probability theory".

If $H_A$ and $H_B$ are Hilbert spaces, their algebraic tensor product $H_A \otimes H_B$ typically
fails to be a Hilbert space in the obvious way, since it is not complete (unless one
of the factors is finite-dimensional). Similarly, the algebraic tensor product $A \otimes B$ of
two C*-algebras A and B usually fails to be a C*-algebra. However, the second case
is far more complicated than the first: for Hilbert spaces there is a canonical norm
on the algebraic tensor product and hence a canonical completion of $H_A \otimes H_B$ into a
Hilbert space $H_A \hat{\otimes} H_B$. For C*-algebras, on the other hand, there is an embarrassment
of riches, in that there are many norms turning the completion $A \hat{\otimes} B$ of $A \otimes B$ in
some such norm into a C*-algebra. However, if A or B is nuclear, there is just one
possibility; see below. For example, this applies if A or B is finite-dimensional.
Let us first review the (algebraic) tensor product of two vector spaces A and B.


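Proposition C.93. Let A and B be (complex) vector spaces. There is a vector space
called $A \otimes B$, in words the algebraic tensor product of A and B (over $\mathbb{C}$), and a
bilinear map $p : A \times B \to A \otimes B$, such that for any vector space C and any bilinear map
$\beta : A \times B \to C$ there is a unique linear map $\beta' : A \otimes B \to C$ such that $\beta = \beta' \circ p$.
In other words, the following diagram commutes:


$\begin{array}{ccc} A \times B & \xrightarrow{\;p\;} & A \otimes B \\ & {\scriptstyle \beta} \searrow & \downarrow {\scriptstyle \beta'} \\ & & C \end{array}$ (C.232)


This universal property also shows that $A \otimes B$ is unique up to isomorphism.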


Proof. In preparation for an explicit construction of $A \otimes B$, define the (complex) free
vector space on any non-empty set X as $C_c(X)$, where X has the discrete topology
(i.e., $C_c(X)$ consists of all functions $f : X \to \mathbb{C}$ with finite support), and pointwise
operations. For each $y \in X$, the delta-function $\delta_y \in C_c(X)$ is defined by $\delta_y(x) = \delta_{xy}$,
so that each element f of $C_c(X)$ is a finite sum $f = \sum_i \lambda_i \delta_{x_i}$, where $\lambda_i \in \mathbb{C}$ and $x_i \in X$.

If A and B are (complex) vector spaces, $A \otimes B$ is the quotient of the free vector
space $C_c(A \times B)$ on $X = A \times B$ by the equivalence relation generated by the relations:


$\delta_{(a_1 + a_2, b)} \sim \delta_{(a_1, b)} + \delta_{(a_2, b)};$ (C.233)
$\delta_{(a, b_1 + b_2)} \sim \delta_{(a, b_1)} + \delta_{(a, b_2)};$ (C.234)
$\lambda \delta_{(a, b)} \sim \delta_{(\lambda a, b)};$ (C.235)
$\lambda \delta_{(a, b)} \sim \delta_{(a, \lambda b)}.$ (C.236)


For $a \in A$, $b \in B$, the image of $\delta_{(a,b)}$ in $A \otimes B$ is called $a \otimes b$, so that by construction,


$(a_1 + a_2) \otimes b = a_1 \otimes b + a_2 \otimes b;$ (C.237)
$a \otimes (b_1 + b_2) = a \otimes b_1 + a \otimes b_2;$ (C.238)
$\lambda(a \otimes b) = (\lambda a) \otimes b = a \otimes (\lambda b).$ (C.239)

Elements of the algebraic tensor product $A \otimes B$ may therefore be written as finite
sums $c = \sum_i a_i \otimes b_i$, with $a_i \in A$, $b_i \in B$, subject to the above relations.
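Now consider some bilinear map $\beta : A \times B \to C$. We extend $\beta$ to a linear map


$\tilde{\beta} : C_c(A \times B) \to C;$ (C.240)
$\tilde{\beta}\left( \sum_i \lambda_i \delta_{(a_i, b_i)} \right) = \sum_i \lambda_i \beta(a_i, b_i).$ (C.241)


Since $\beta$ is bilinear, $\tilde{\beta}$ respects the above equivalence relation, so that it duly quotients
to $\beta' : A \otimes B \to C$, upon which the property $\beta = \beta' \circ p$ holds by construction. Finally,
since the image of p spans $A \otimes B$, the latter property uniquely determines $\beta'$.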
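
In finite dimensions the factorization can be written out explicitly. The sketch below assumes A = C^2, B = C^3 and target space C, and models $A \otimes B$ as 2 x 3 arrays; the matrix M and all function names are illustrative choices made here, not notation from the text.

```python
# Sketch of the universal property: a bilinear map beta(a, b) = sum_ij a_i M_ij b_j
# factors through the canonical map p(a, b) = a (x) b.

def p(a, b):
    # Canonical bilinear map A x B -> A (x) B, with tensors stored as arrays.
    return [[ai * bj for bj in b] for ai in a]

def make_beta(M):
    def beta(a, b):
        return sum(a[i] * M[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return beta

def make_beta_prime(M):
    # The induced linear map beta' on A (x) B.
    def beta_prime(t):
        return sum(M[i][j] * t[i][j]
                   for i in range(len(M)) for j in range(len(M[0])))
    return beta_prime

def add(t, s):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(t, s)]

M = [[1, 2, 0], [0, -1, 3]]
beta, beta_prime = make_beta(M), make_beta_prime(M)
a1, a2, b = [1, 2], [3, -1], [2, 0, 5]
```

Here `beta_prime(p(a1, b))` agrees with `beta(a1, b)`, and the relation (C.237) holds on the nose in this model, since `p` is additive in its first argument.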


Equivalently, $A \otimes B$ is the quotient of formal sums $\sum_i (a_i, b_i)$ by the subspace
consisting of those sums for which $\sum_i \varphi_A(a_i)\varphi_B(b_i) = 0$ for all $\varphi_A \in A^*$ and
$\varphi_B \in B^*$. Similarly, it is useful to regard $A \otimes B$ as a subspace of the
vector space $L(A^*, B)$ of linear maps from the dual $A^*$ to B through the map


$\sum_i a_i \otimes b_i : \varphi_A \mapsto \sum_i \varphi_A(a_i) b_i \quad (\varphi_A \in A^*);$ (C.242)


this map is injective by Corollary B.45.2, since we may assume the $b_i$ to be linearly
independent. Using the canonical embedding $B \hookrightarrow B^{**}$ of Proposition B.44, this in
turn yields an injection $A \otimes B \hookrightarrow L(A^* \times B^*, \mathbb{C})$, i.e., the space of bilinear maps from
$A^* \times B^*$ to $\mathbb{C}$, given on arguments $(\varphi_A, \varphi_B)$ by


$\sum_i a_i \otimes b_i : (\varphi_A, \varphi_B) \mapsto \sum_i \varphi_A(a_i)\varphi_B(b_i).$ (C.243)


Proposition C.93 turns this into an injection $A \otimes B \hookrightarrow L(A^* \otimes B^*, \mathbb{C})$, given by


$\sum_i a_i \otimes b_i : \sum_j (\varphi_A)_j \otimes (\varphi_B)_j \mapsto \sum_{i,j} (\varphi_A)_j(a_i)\, (\varphi_B)_j(b_i).$ (C.244)
i j ij 


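If A and B are Hilbert spaces, we call them $H_A$ and $H_B$, denote their elements by
$\alpha$ and $\beta$, respectively, and attempt to define a sesquilinear form on $H_A \otimes H_B$ by


$\left\langle \sum_i \alpha_i \otimes \beta_i, \sum_j \alpha'_j \otimes \beta'_j \right\rangle = \sum_{i,j} \langle \alpha_i, \alpha'_j \rangle_{H_A} \langle \beta_i, \beta'_j \rangle_{H_B}.$ (C.245)


It is a non-trivial fact that this form is well defined, because representations $\sum_i \alpha_i \otimes \beta_i$
of vectors in $H_A \otimes H_B$ may not be unique. For example, if $H_A = H_B = H = \mathbb{C}^n$, and
$(\alpha_i)$ and $(\alpha'_i)$ are two bases of H, then $\sum_i \alpha_i \otimes \overline{\alpha_i} = \sum_i \alpha'_i \otimes \overline{\alpha'_i}$, where the bar denotes
componentwise complex conjugation (to see this, take inner
products with an arbitrary elementary tensor $\psi \otimes \varphi$, yielding the same result).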

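To resolve this, we note that the injection $H_A \otimes H_B \hookrightarrow L(H_A^* \times H_B^*, \mathbb{C})$ just discussed
combines with the isomorphism $H^* \cong \overline{H}$ of Theorem B.66 to an injection
$H_A \otimes H_B \hookrightarrow L(\overline{H}_A \times \overline{H}_B, \mathbb{C})$, i.e., the space of bi-anti-linear maps from $H_A \times H_B$ to
$\mathbb{C}$. Proposition C.93 turns this into an injection $H_A \otimes H_B \hookrightarrow L(\overline{H}_A \otimes \overline{H}_B, \mathbb{C})$, viz.

$\sum_i \alpha_i \otimes \beta_i : \sum_j \alpha'_j \otimes \beta'_j \mapsto \sum_{i,j} \langle \alpha'_j, \alpha_i \rangle_{H_A} \langle \beta'_j, \beta_i \rangle_{H_B}.$ (C.246)


Consequently, if $\sum_i \alpha_i \otimes \beta_i = 0$, then the right-hand side of (C.245) is zero, too, since
it is the complex conjugate of the image of $\sum_j \alpha'_j \otimes \beta'_j$ under the zero map. Hence (C.245) is independent of
the choice of representatives in the sum $\sum_i \alpha_i \otimes \beta_i$, and by hermiticity of the form,
this equally well applies to the other entry $\sum_j \alpha'_j \otimes \beta'_j$.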
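
Independence of the chosen representation can also be checked numerically in a small case. The sketch below assumes $H_A = H_B = \mathbb{C}^2$ with the inner product antilinear in its first entry; the particular vectors and all helper names are choices made here for illustration. It evaluates the form (C.245) on two different ways of writing one and the same tensor.

```python
# Sketch: the form (C.245) evaluated on two different representations of the
# same vector in C^2 (x) C^2 gives the same number.

def ip(x, y):
    # Inner product on C^n, antilinear in the first entry.
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def form(rep1, rep2):
    # (C.245): rep1 and rep2 are lists of pairs (alpha_i, beta_i).
    return sum(ip(a, a2) * ip(b, b2) for a, b in rep1 for a2, b2 in rep2)

def kron(rep):
    # Coordinates of sum_i alpha_i (x) beta_i in the product basis.
    n, m = len(rep[0][0]), len(rep[0][1])
    return [sum(a[i] * b[j] for a, b in rep) for i in range(n) for j in range(m)]

e0, e1 = [1, 0], [0, 1]
rep_a = [(e0, e0), (e1, e1)]
# The same vector written with u0 = (1, i) and u1 = (1, -i) and their complex
# conjugates; the normalization factor 1/2 is absorbed into the first entries.
rep_b = [([0.5, 0.5j], [1, -1j]), ([0.5, -0.5j], [1, 1j])]
test_vector = [([1, 2j], [3, 1]), ([0, 1], [1j, -2])]
```

Both `rep_a` and `rep_b` have the same coordinates under `kron`, and `form` returns the same value for either of them against `test_vector`, namely the ordinary inner product of the coordinate vectors.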

It remains to show that (C.245) is an inner product, i.e., that it is positive definite.
To see this, for some given vector $\sum_i \alpha_i \otimes \beta_i$ in $H_A \otimes H_B$ one may take the linear span
$H'_A$ of all $\alpha_i$ in $H_A$, which is a Hilbert space, and pick a basis $(\upsilon_k)$ in $H'_A$. Absorbing
the scalars in the $\beta_i$, we may therefore write $\sum_i \alpha_i \otimes \beta_i = \sum_k \upsilon_k \otimes \beta'_k$, so that


$\left\langle \sum_i \alpha_i \otimes \beta_i, \sum_j \alpha_j \otimes \beta_j \right\rangle = \sum_{k,l} \langle \upsilon_k \otimes \beta'_k, \upsilon_l \otimes \beta'_l \rangle = \sum_k \|\beta'_k\|^2_{H_B} \geq 0,$ (C.247)
with equality at the end iff each $\beta'_k = 0$, and hence $\sum_i \alpha_i \otimes \beta_i = 0$.
Finally, we complete $H_A \otimes H_B$ in the norm defined by the inner product (C.245);
with abuse of notation the ensuing Hilbert space is often just called $H_A \otimes H_B$, but it
would be more precise to denote it by $H_A \hat{\otimes} H_B$, as we will usually do.

It is easy to show that if $(\upsilon_i^{(1)})$ and $(\upsilon_j^{(2)})$ are bases for $H_A$ and $H_B$, respectively,
then $(\upsilon_i^{(1)} \otimes \upsilon_j^{(2)})$ is a basis of $H_A \hat{\otimes} H_B$. Also, if $(X, \Sigma, \mu)$ and $(X', \Sigma', \mu')$ are
σ-finite measure spaces with X and X' well behaved (e.g., Polish), so that the $L^2$-spaces
are separable, one has a natural isomorphism


$L^2(X, \Sigma, \mu) \hat{\otimes} L^2(X', \Sigma', \mu') \cong L^2(X \times X', \Sigma \times \Sigma', \mu \times \mu'),$ (C.248)


obtained as the closure of the isometric (and hence bounded) map that sends the
vector $\sum_i \psi_i \otimes \psi'_i$ into the function $(x, x') \mapsto \sum_i \psi_i(x)\psi'_i(x')$ on $X \times X'$. Here $\Sigma \times \Sigma'$
is the smallest σ-algebra on $X \times X'$ that contains all sets $A \times A'$, $A \in \Sigma$, $A' \in \Sigma'$, and
$\mu \times \mu'$ is the familiar product measure defined on elementary measurable sets by


$\mu \times \mu'(A \times A') = \mu(A)\mu'(A').$ (C.249)


We now turn to tensor products of C*-algebras. If A and B are C*-algebras, then
the algebraic tensor product $A \otimes B$ of A and B (just seen as vector spaces) is endowed
with a natural multiplication and involution, given by linear extension of


$(a_1 \otimes b_1) \cdot (a_2 \otimes b_2) = (a_1 a_2) \otimes (b_1 b_2);$ (C.250)
$(a \otimes b)^* = a^* \otimes b^*,$ (C.251)


respectively. Thus $A \otimes B$ is a *-algebra, and Proposition C.93 specializes to:


Proposition C.94. If C is a *-algebra and if a bilinear map $\beta : A \times B \to C$ satisfies
$\beta(a_1 a_2, b_1 b_2) = \beta(a_1, b_1)\beta(a_2, b_2); \quad \beta(a^*, b^*) = \beta(a, b)^*,$ (C.252)


then $\beta$ factors through $A \otimes B$ (now seen as a *-algebra), as in (C.232).

The proof is similar. In order to turn $A \otimes B$ (seen as a *-algebra) into a C*-algebra,
we need a C*-norm, i.e., a norm on $A \otimes B$ satisfying the C*-axioms (C.1) - (C.2).
If such a norm exists, we denote the completion of $A \otimes B$ in that particular norm by
$A \hat{\otimes} B$, where typically $\|\cdot\|$ and hence $\hat{\otimes}$ carry some label. This completion $A \hat{\otimes} B$ is
a C*-algebra in the obvious way. There will be no shortage of such norms!

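For example, suppose $A \subset B(H_A)$ and $B \subset B(H_B)$. For each $a \in A$, we form the
operator $a \otimes 1_B$ on $H_A \otimes H_B$ (where $1_B$ is the unit of $B(H_B)$, which is also the unit of
B if it has one). As in (C.247), but with the roles of the two factors interchanged, we may
assume that generic elements of $H_A \otimes H_B$ take the form $\sum_k \alpha_k \otimes \upsilon_k$, with the $\upsilon_k$
orthonormal in $H_B$ and $\alpha_k \in H_A$. We then estimate


$\left\| (a \otimes 1_B) \left( \sum_k \alpha_k \otimes \upsilon_k \right) \right\|^2 = \left\| \sum_k (a\alpha_k) \otimes \upsilon_k \right\|^2 = \sum_k \|a\alpha_k\|^2 \leq \|a\|^2 \sum_k \|\alpha_k\|^2 = \|a\|^2 \left\| \sum_k \alpha_k \otimes \upsilon_k \right\|^2.$ (C.253)


Hence $a \otimes 1_B$ is bounded on the pre-Hilbert space $H_A \otimes H_B$, and extends to a bounded
operator on $H_A \hat{\otimes} H_B$ by continuity; this extension is usually called $a \otimes 1_B$, too. Similarly,
any $b \in B$ defines a bounded operator $1_A \otimes b$ on $H_A \hat{\otimes} H_B$, and since


$a \otimes b = (a \otimes 1_B) \cdot (1_A \otimes b),$ (C.254)


all elements $\sum_i a_i \otimes b_i$ of $A \otimes B$ extend to elements of $B(H_A \hat{\otimes} H_B)$. Now define
$\left\| \sum_i a_i \otimes b_i \right\|_{\min} = \left\| \sum_i a_i \otimes b_i \right\|_{B(H_A \hat{\otimes} H_B)}.$ (C.255)
This is clearly a C*-norm on $A \otimes B$. Moreover, it is a cross-norm, in that
$\|a \otimes b\|_{\min} = \|a\| \|b\|.$ (C.256)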


This construction generalizes to any two C*-algebras, since by Theorem C.87 we
have injective representations $\pi_A : A \to B(H_A)$ and $\pi_B : B \to B(H_B)$ of A and B,
respectively, and it is easy to verify that the norm $\|\cdot\|_{\min}$ on $A \otimes B$ and ensuing
completion $A \hat{\otimes}_{\min} B$ are independent of the chosen representation. Furthermore,


$\|c\|_{\min} = \sup\{ \|\pi_A \otimes \pi_B(c)\|_{B(H_A \hat{\otimes} H_B)} \},$ (C.257)


where $\pi_A$ and $\pi_B$ run through all representations of A and B, respectively. The ensuing
completion $A \hat{\otimes}_{\min} B$ is called the injective tensor product of A and B. Without
proof (which requires more advanced methods than the elementary arguments we
use in this section), we mention that, as its name suggests, $\|\cdot\|_{\min}$ is the smallest
C*-norm on $A \otimes B$. This has a very important consequence:


Proposition C.95. Any C*-norm $\|\cdot\|$ on $A \otimes B$ satisfies $\|a \otimes b\| = \|a\| \|b\|$.


In other words, any C*-norm $\|\cdot\|$ on $A \otimes B$ is a cross-norm. To prove this from the
minimality of the spatial norm, we need a lemma of wider interest.

Lemma C.96. If $\|\cdot\|$ is any C*-norm on $A \otimes B$, then for all $a \in A$ and $b \in B$,
$\|a \otimes b\| \leq \|a\| \|b\|.$ (C.258)


Consequently, for any C*-norm on $A \otimes B$ and any $c \in A \otimes B$, we have the bound


$\|c\| \leq \inf\left\{ \sum_i \|a_i\| \|b_i\| \;\middle|\; c = \sum_i a_i \otimes b_i \right\}.$ (C.259)


Proof. In any C*-algebra A, if $a \geq 0$, we have $\|a\| \leq 1$ iff $a^2 \leq a$. This is trivial for
A = C(X), and in general can be proved within $C^*(a) \subset A$, since $C^*(a) \cong C(\sigma(a))$.
Now take $a \in A$ and $b \in B$ such that $a \geq 0$, $b \geq 0$, $\|a\| \leq 1$, and $\|b\| \leq 1$, so that
$(a \otimes b)^2 = a^2 \otimes b^2 \leq a \otimes b^2 \leq a \otimes b$, and hence $\|a \otimes b\| \leq 1$. For general $a \geq 0$, $b \geq 0$,
rescaling to $a/\|a\|$ etc. gives (C.258). For general a, b altogether, we compute:


$\|a \otimes b\|^2 = \|(a \otimes b)^*(a \otimes b)\| = \|a^*a \otimes b^*b\| \leq \|a^*a\| \|b^*b\| = \|a\|^2 \|b\|^2.$ (C.260)


Eq. (C.259) then follows from the triangle inequality for the norm.


If A and B each have a unit, there is a simpler proof: as in (C.254), we have
$\|a \otimes b\| = \|(a \otimes 1_B)(1_A \otimes b)\| \leq \|a \otimes 1_B\| \|1_A \otimes b\| = \|a\| \|b\|,$ (C.261)


where we used $\|a \otimes 1_B\| = \|a\|$ etc., which is the case because the map $a \mapsto a \otimes 1_B$
from A to $A \hat{\otimes} B$ is injective and hence is an (isometric) isomorphism onto its image.
We now prove Proposition C.95.


Proof. For any C*-norm $\|\cdot\|$, we have $\|a \otimes b\| \geq \|a \otimes b\|_{\min} = \|a\| \|b\|$, since the
spatial norm is itself a cross-norm, cf. (C.256). Then (C.258) gives equality.


In view of (C.259) and the existence of at least one C*-norm on $A \otimes B$ (namely
the spatial one), it makes sense to define the maximal C*-norm on $A \otimes B$ by


$\left\| \sum_i a_i \otimes b_i \right\|_{\max} = \sup\left\{ \left\| \sum_i a_i \otimes b_i \right\| \;\middle|\; \|\cdot\| \text{ is a C*-norm on } A \otimes B \right\}.$ (C.262)


This is clearly a C*-norm, and hence it is also a cross-norm, i.e.,


$\|a \otimes b\|_{\max} = \|a\| \|b\|.$ (C.263)


This property may be proved without using the deep result that the spatial norm is
the minimal one (which in turn led to Proposition C.95); all we need is the inequality


$\|c\|_{\min} \leq \|c\|_{\max},$ (C.264)


for any $c \in A \otimes B$, which follows from the definition of $\|\cdot\|_{\max}$, upon which (C.263)
may be proved in the same way as for general C*-norms. The completion $A \hat{\otimes}_{\max} B$
of $A \otimes B$ in the norm $\|\cdot\|_{\max}$ is called the projective tensor product of A and B.

If we define representations of the pre-C*-algebra $A \otimes B$ on Hilbert spaces in the
same way as for C*-algebras, i.e., as linear maps $\pi : A \otimes B \to B(H)$ that preserve
the product (C.250) and the involution (C.251), we obtain


$\|c\|_{\max} = \sup\{\|\pi(c)\|\},$ (C.265)


where $c = \sum_i a_i \otimes b_i \in A \otimes B$, and $\pi$ runs through all representations of $A \otimes B$. Indeed,
according to Theorem C.87 there exists an injective representation $\pi$ of $A \hat{\otimes}_{\max} B$,
so that $\|c\|_{\max} = \|\pi(c)\|$ for each $c \in A \hat{\otimes}_{\max} B$, and hence also for each $c \in A \otimes B$.
Furthermore, any representation of $A \otimes B$ yields a cross-norm, so that (C.265)
follows. This also shows that the supremum in (C.265) is actually attained.

In what follows, we restrict ourselves to the case that A and B have a unit, which
suffices for our applications, but the claim is true in general (with a slightly more
complicated proof, involving either approximate units or unitizations). If A and B
each have a unit, so does $A \otimes B$, viz. $1_A \otimes 1_B$. States $\omega$ on $A \otimes B$ are then defined
as for unital C*-algebras, i.e., as positive linear functionals (in the usual sense that
$\omega(c^*c) \geq 0$ for any $c \in A \otimes B$) that map the unit $1_A \otimes 1_B$ of $A \otimes B$ to 1.


Proposition C.97. Let A and B be unital. Then each state on $A \otimes B$ is continuous
with respect to the $\|\cdot\|_{\max}$-norm, and hence extends to a state on the maximal tensor
product $A \hat{\otimes}_{\max} B$. Thus identifying states on $A \otimes B$ and on $A \hat{\otimes}_{\max} B$, we have


$S(A \otimes B) \cong S(A \hat{\otimes}_{\max} B).$ (C.266)


Proof. Let $\omega : A \otimes B \to \mathbb{C}$ be a state. Although $A \otimes B$ may not be a C*-algebra, the
GNS-construction of Theorem C.88 goes through as if it were. The reason is that the
only delicate point, namely boundedness of $\pi_\omega(a \otimes b)$, may be proved from (C.94),
just as in the usual case. Indeed, for $a \in A$, $b \in B$, and $c \in A \otimes B$, we estimate


$\|\pi_\omega(a \otimes b)c_\omega\|^2 = \omega(c^*(a \otimes b)^*(a \otimes b)c) = \omega(c^*(a^*a \otimes b^*b)c)$
$\leq \|a\|^2 \|b\|^2 \omega(c^*c) = \|a\|^2 \|b\|^2 \|c_\omega\|^2$
$= \|a \otimes b\|^2_{\max} \|c_\omega\|^2,$


so that $\|\pi_\omega(a \otimes b)\| \leq \|a \otimes b\|_{\max}$, and hence $\pi_\omega(a \otimes b)$ may be extended to the
completion $H_\omega$ of $(A \otimes B)/N_\omega$ by continuity. Here we used the facts that:

• $(a \otimes b)^*(a \otimes b) = a^*a \otimes b^*b$, so that the right-hand side is positive in $A \otimes B$.

• $0 \leq a^*a \leq \|a\|^2 1_A$ and $0 \leq b^*b \leq \|b\|^2 1_B$, as A and B are C*-algebras, cf. (C.83).

• If $c' \geq 0$ in $A \otimes B$, then $c^*c'c \geq 0$, as for C*-algebras, see the argument preceding
(C.93). The argument is the same: writing $c' = d^*d$, we have $c^*c'c = c^*d^*dc = (dc)^*dc \geq 0$.

Writing $\Omega_\omega = (1_A \otimes 1_B)_\omega$ for the cyclic vector of $H_\omega$, as in (C.208), for any element
$c \in A \otimes B$ we obtain, using (C.265) in the final inequality, the decisive bound


$|\omega(c)| = |\langle \Omega_\omega, \pi_\omega(c)\Omega_\omega \rangle| \leq \|\pi_\omega(c)\| \leq \|c\|_{\max}.$ (C.267)


In other words, $\omega$ is continuous with respect to the $\|\cdot\|_{\max}$-norm, and since $A \otimes B$
is dense in $A \hat{\otimes}_{\max} B$, the state extends to the completed tensor product by continuity.
It follows from (C.267) and $\omega(1_A \otimes 1_B) = 1$ that $\|\omega\| = 1$ as a functional on $A \otimes B$
equipped with the $\|\cdot\|_{\max}$-norm, so that the extension in question has the same
norm, and hence by Proposition C.5 is a state on $A \hat{\otimes}_{\max} B$. Conversely, a state on
$A \hat{\otimes}_{\max} B$ restricts to a state on $A \otimes B$, since the two *-algebras have the same unit
and (trivially) if c is positive in the latter, then so it is in the former.


The above proposition concerns extensions of arbitrary states on $A \otimes B$. However,
product states on $A \otimes B$ can be extended to any completed tensor product $A \hat{\otimes} B$.


Proposition C.98. If $\omega_A$ and $\omega_B$ are states on A and B, respectively, then the corresponding
product state $\omega_A \otimes \omega_B$ on $A \otimes B$, defined as in (C.243) by


$\omega_A \otimes \omega_B \left( \sum_i a_i \otimes b_i \right) = \sum_i \omega_A(a_i)\omega_B(b_i),$ (C.268)


is continuous with respect to any cross-norm $\|\cdot\|$, and hence extends to $A \hat{\otimes} B$.


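Proof. Since the spatial norm is minimal among all cross-norms, it is enough to
prove continuity with respect to $\|\cdot\|_{\min}$. As in the proof of Proposition C.97, we
form the GNS-representation $\pi_{\omega_A \otimes \omega_B}$ induced by $\omega_A \otimes \omega_B$, so that for any $c \in A \otimes B$,


$(\omega_A \otimes \omega_B)(c) = \langle \Omega_{\omega_A \otimes \omega_B}, \pi_{\omega_A \otimes \omega_B}(c)\Omega_{\omega_A \otimes \omega_B} \rangle.$ (C.269)


Now consider the representation $\pi_{\omega_A}(A) \otimes \pi_{\omega_B}(B)$ on $H_{\omega_A} \hat{\otimes} H_{\omega_B}$, with cyclic vector
$\Omega_{\omega_A} \otimes \Omega_{\omega_B}$. Writing $c = \sum_i a_i \otimes b_i$ as usual, a simple computation gives
$\langle \Omega_{\omega_A} \otimes \Omega_{\omega_B}, (\pi_{\omega_A} \otimes \pi_{\omega_B})(c)\, \Omega_{\omega_A} \otimes \Omega_{\omega_B} \rangle$
$= \sum_i \langle \Omega_{\omega_A}, \pi_{\omega_A}(a_i)\Omega_{\omega_A} \rangle \langle \Omega_{\omega_B}, \pi_{\omega_B}(b_i)\Omega_{\omega_B} \rangle = \sum_i \omega_A(a_i)\omega_B(b_i)$
$= (\omega_A \otimes \omega_B)(c).$ (C.270)


Using the same reasoning as in (the proof of) Proposition C.91 (which does not apply
literally, since it is about C*-algebras), it follows from (C.270) that $\pi_{\omega_A \otimes \omega_B}(A \otimes B)$
is unitarily equivalent to $\pi_{\omega_A}(A) \otimes \pi_{\omega_B}(B)$, so that, using (C.270), analogously to
(C.267) but this time using (C.257) at the end, we have


$|(\omega_A \otimes \omega_B)(c)| \leq \|\pi_{\omega_A} \otimes \pi_{\omega_B}(c)\| \leq \|c\|_{\min}.$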


As an application, analogously to (C.248), we show that:


Proposition C.99. For any locally compact Hausdorff spaces X, Y and any cross-norm
on $C_0(X) \otimes C_0(Y)$, with completed tensor product $C_0(X) \hat{\otimes} C_0(Y)$, we have


$C_0(X) \hat{\otimes} C_0(Y) \cong C_0(X \times Y),$ (C.271)


under the isomorphism given by continuous extension of the map $f \otimes g \mapsto fg$, where $fg :
(x, y) \mapsto f(x)g(y)$, from the algebraic tensor product $C_0(X) \otimes C_0(Y)$ to $C_0(X \times Y)$.

Proof. We just prove the unital case, where X and Y are compact.
Let $x \in X$ and $y \in Y$, and take the corresponding evaluation maps $ev_x$ and $ev_y$
on C(X) and C(Y), respectively. These are multiplicative states, cf. Proposition
C.19. Then $ev_x \otimes ev_y$ is a nonzero multiplicative state on $C(X) \otimes C(Y)$, and hence
also on $C(X) \hat{\otimes} C(Y)$, cf. Proposition C.98. This gives an injection of $X \times Y$ into
$\Sigma(C(X) \hat{\otimes} C(Y))$, i.e., the Gelfand spectrum of $C(X) \hat{\otimes} C(Y)$, cf. §C.2.

Conversely, the restriction $\omega_1$ of any $\omega \in \Sigma(C(X) \hat{\otimes} C(Y))$ to C(X), given by
$\omega_1(f) = \omega(f \otimes 1_Y)$, is multiplicative, as is the restriction $\omega_2$ of $\omega$ to C(Y), defined
by $\omega_2(g) = \omega(1_X \otimes g)$. Then $\omega = \omega_1 \otimes \omega_2$, with ensuing injective map
$\Sigma(C(X) \hat{\otimes} C(Y)) \to X \times Y$. Thus the above injection is also a surjection, and hence
a bijection, which is easily seen to be a homeomorphism.


This can also be proved without Proposition C.98, using only the second step: if $\Sigma(C(X) \hat{\otimes} C(Y)) \neq X \times Y$, then, since $\Sigma(C(X) \hat{\otimes} C(Y))$ is closed in $X \times Y$, there are nonempty opens $U \subset X$ and $V \subset Y$ such that $(U \times V) \cap \Sigma(C(X) \hat{\otimes} C(Y)) = \emptyset$. Now take nonzero functions $f \in C_c(U)$ and $g \in C_c(V)$, so that $\omega(f \otimes g) = 0$ for all $\omega \in \Sigma(C(X) \hat{\otimes} C(Y))$. This contradicts the isometry (C.18) of the Gelfand transform.


Proposition C.100. For any locally compact Hausdorff space $X$ and any C*-algebra $B$, let $C_0(X,B)$ be the C*-algebra of all continuous functions $f : X \to B$ for which the function $x \mapsto \|f(x)\|_B$ is in $C_0(X)$, equipped with the supremum norm

$$\|f\| = \sup\{\|f(x)\|_B,\ x \in X\}. \tag{C.272}$$

For any C*-norm with ensuing tensor product $\hat{\otimes}$, one then has

$$C_0(X) \hat{\otimes} B \cong C_0(X,B), \tag{C.273}$$

under continuous extension of the map from $C_0(X) \otimes B$ to $C_0(X,B)$ defined by

$$f \otimes b \mapsto (fb : x \mapsto f(x)b). \tag{C.274}$$


We just prove this for the minimal (i.e. spatial) C*-norm; the general case follows 
from nuclearity of Co(X), cf. Proposition C.101 below. 


Proof. Take some injective representation $\pi_B : B \to B(H_B)$, and represent $C_0(X,B)$ on $\ell^2(X) \otimes H_B$ by linear extension of $\pi : C_0(X,B) \to B(\ell^2(X) \otimes H_B)$, as defined by

$$\pi(f)(\delta_x \otimes \varphi) = \delta_x \otimes \pi_B(f(x))\varphi, \tag{C.275}$$

where $f \in C_0(X,B)$, $x \in X$, and $\varphi \in H_B$; this operator is easily seen to be bounded. In particular, an element $fb \in C_0(X,B)$, as in (C.274), is represented by

$$\pi(fb)(\delta_x \otimes \varphi) = f(x)\delta_x \otimes \pi_B(b)\varphi. \tag{C.276}$$

Denoting the representation of $C_0(X)$ on $\ell^2(X)$ through multiplication operators by $\pi_m$, i.e., $\pi_m(f)\psi(x) = f(x)\psi(x)$, where $f \in C_0(X)$ and $\psi \in \ell^2(X)$, we then have

$$\pi_m \otimes \pi_B(f \otimes b) = \pi(fb). \tag{C.277}$$
In this way, $C_0(X) \otimes B$ is faithfully represented as a subalgebra of


and so the final step is merely to show that $C_0(X) \otimes B$ is dense in $C_0(X,B)$. Indeed, taking $X$ compact for simplicity (otherwise one needs a further approximation argument), for given $f \in C(X,B)$ and $\varepsilon > 0$, define a cover $\mathcal{U} = (U_x)_{x \in X}$ of $X$ by

$$U_x = \{y \in X \mid \|f(x) - f(y)\| < \varepsilon\}. \tag{C.279}$$


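Since $X$ is compact, $\mathcal{U}$ has a finite subcover $\{U_{x_1}, \ldots, U_{x_n}\}$, with associated partition of unity $\{g_{x_1}, \ldots, g_{x_n}\}$, i.e., one has $g_{x_i} \in C_c(U_{x_i})$ with $0 \leq g_{x_i} \leq 1$, and

$$\sum_{i=1}^n g_{x_i}(x) = 1 \quad (x \in X). \tag{C.280}$$

Define an approximant $g \in C(X) \otimes B$ by

$$g = \sum_{i=1}^n g_{x_i} \otimes f(x_i), \tag{C.281}$$

whose image $\tilde{g} \in C(X,B)$ is given by $\tilde{g}(x) = \sum_i g_{x_i}(x) f(x_i)$. Then for each $x \in X$,

$$\|\tilde{g}(x) - f(x)\|_B = \Big\| \sum_{i=1}^n g_{x_i}(x)\,(f(x_i) - f(x)) \Big\|_B \leq \sum_{i=1}^n g_{x_i}(x)\, \varepsilon = \varepsilon, \tag{C.282}$$

so that, taking $\sup_x$, we have $\|\tilde{g} - f\| \leq \varepsilon$. This proves the claim.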
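
The density argument can be checked numerically in a toy case. The following is only a sketch under stated assumptions: $X = [0,1]$, $B = \mathbb{C}^2$ with the max-norm, a hypothetical function `f`, and tent functions on a uniform grid as the partition of unity (their supports lie in the sets $U_{x_i}$ because this particular `f` is Lipschitz with constant 1 and the grid spacing is below `eps`).

```python
import math

def f(x):
    """A continuous map [0, 1] -> B, with B = C^2 under the max-norm (hypothetical example)."""
    return (math.cos(x), math.sin(x))

def norm_B(v):
    """Max-norm on B = C^2."""
    return max(abs(c) for c in v)

def hat(center, h):
    """Tent function with peak 1 at `center`, vanishing outside (center - h, center + h)."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / h)

eps = 0.1
m = 20                      # number of grid intervals; spacing h = 0.05 < eps
h = 1.0 / m
centers = [i * h for i in range(m + 1)]
g = [hat(c, h) for c in centers]          # partition of unity on [0, 1]

def approximant(x):
    """Image of sum_i g_i (x) f(x_i) in C(X, B), evaluated at x."""
    return tuple(sum(gi(x) * f(c)[k] for gi, c in zip(g, centers)) for k in range(2))

samples = [j / 1000 for j in range(1001)]
partition_error = max(abs(sum(gi(x) for gi in g) - 1.0) for x in samples)
max_error = max(norm_B(tuple(a - b for a, b in zip(approximant(x), f(x)))) for x in samples)
```

On the sample points the tent functions sum to one up to rounding, and the sup-norm distance between the approximant and `f` stays below `eps`, in line with the estimate (C.282).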


Since $C_0(X \times Y) \cong C_0(X, C_0(Y))$ under the map $f \mapsto \tilde{f}$ with $f(x,y) = (\tilde{f}(x))(y)$, the isomorphism (C.271) is a special case of (C.273).

Another case where the choice of a cross-norm does not matter—this time be- 
cause no completion is even needed—is the following. Recall Corollary C.28. 


Proposition C.101. Let $A$ be a finite-dimensional C*-algebra. Then for any C*-algebra $B$, $A \otimes B$ is complete in any C*-norm, and hence all C*-norms coincide.


Thus $A \hat{\otimes} B = A \otimes B$, though one still needs a norm on $A \otimes B$ to make it a C*-algebra!


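Proof. In view of Theorem C.163, we only need to prove this for $A = M_n(\mathbb{C})$, $n \in \mathbb{N}$. As in the previous proof, we use the spatial tensor product on $M_n(\mathbb{C}) \otimes B$, so let us faithfully represent $M_n(\mathbb{C})$ and $B$ on $\mathbb{C}^n$ and $H_B$, respectively, and form the Hilbert space $\mathbb{C}^n \hat{\otimes} H_B = \mathbb{C}^n \otimes H_B$, carrying the representation $\mathrm{id} \otimes \pi_B$ of $M_n(\mathbb{C}) \otimes B$, and hence of the (alleged) completion $M_n(\mathbb{C}) \hat{\otimes}_{\min} B$. Let

$$c = \sum_{i,j=1}^n e_{ij} b^{ij} \in M_n(\mathbb{C}) \otimes B, \tag{C.283}$$

where $(e_{ij})$ is the standard basis of $M_n(\mathbb{C})$ and $b^{ij} \in B$. For any such $c$, we have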
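$$\|c\|_{\min} \geq \Big\| \sum_{i,j=1}^n e_{ij} b^{ij} (\upsilon_k \otimes \varphi) \Big\|_{\mathbb{C}^n \otimes H_B} = \Big( \sum_i \|b^{ik}\varphi\|^2_{H_B} \Big)^{1/2}, \tag{C.284}$$

where $(\upsilon_1, \ldots, \upsilon_n)$ is the standard basis of $\mathbb{C}^n$, $k = 1, \ldots, n$ is fixed, and $\varphi \in H_B$ is a unit vector. Taking the supremum over $\varphi$ gives

$$\Big\| \sum_{i,j} e_{ij} b^{ij} \Big\|_{\min} \geq \|b^{ij}\|_B, \tag{C.285}$$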

for each fixed pair $(i,j)$. Hence any Cauchy sequence $(c_k)$ in $M_n(\mathbb{C}) \otimes B$ takes the form $c_k = \sum_{i,j=1}^n e_{ij} b_k^{ij}$, where each $(b_k^{ij})$ is a Cauchy sequence in $B$ for fixed $(i,j)$. Then, using the fact that $\|e_{ij}\|_{M_n(\mathbb{C})} = 1$, we have

$$\|c - c_k\|_{\min} = \Big\| \sum_{i,j=1}^n e_{ij} (b^{ij} - b_k^{ij}) \Big\|_{\min} \leq \sum_{i,j} \|b^{ij} - b_k^{ij}\|_B, \tag{C.286}$$

for any $c \in M_n(\mathbb{C}) \otimes B$, as in (C.283). Taking $c$ such that $b^{ij} = \lim_k b_k^{ij}$, it follows that $c_k \to c$ in $\|\cdot\|_{\min}$, i.e., in $M_n(\mathbb{C}) \hat{\otimes}_{\min} B$. In particular, the limit $c$ of any Cauchy sequence in $M_n(\mathbb{C}) \otimes B$ with respect to the norm $\|\cdot\|_{\min}$ lies in $M_n(\mathbb{C}) \otimes B$, which is therefore complete already and is a C*-algebra in the spatial norm. Since the norm in a C*-algebra is unique (cf. Corollary C.28), it follows that any C*-norm on $M_n(\mathbb{C}) \otimes B$ must coincide with the spatial one $\|\cdot\|_{\min}$.


It is also easy to show that

$$M_n(\mathbb{C}) \otimes B \cong M_n(B), \tag{C.287}$$

i.e., the $n \times n$-matrices with entries in $B$, with obvious operations and norm given by faithfully representing $B$ on some Hilbert space $H_B$, as above, and then letting $M_n(B)$ act on $H_B^n = H_B \oplus \cdots \oplus H_B$ (i.e., $n$ copies) in the natural way. A specific isomorphism $M_n(\mathbb{C}) \otimes B \to M_n(B)$ is then given by sending $\sum e_{ij} b^{ij}$ to the matrix $(b^{ij})$.
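
As an illustration only, the isomorphism (C.287) can be made concrete for $n = 2$ and the hypothetical choice $B = M_2(\mathbb{C})$, where the tensor product of matrices is realized by the Kronecker product. The helper names below (`kron`, `unit`, `blocks`) are assumptions made for this sketch.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def mat_add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def unit(n, i, j):
    """Matrix unit e_ij in M_n(C), with zero-based indices."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

n = 2
# Entries b^{ij} in B = M_2(C), chosen arbitrarily for the example.
blocks = {
    (0, 0): [[1, 2], [3, 4]],
    (0, 1): [[0, 1], [1, 0]],
    (1, 0): [[5, 0], [0, 5]],
    (1, 1): [[1, 1j], [-1j, 1]],
}

# c = sum_{ij} e_ij (x) b^{ij}, computed with Kronecker products.
c = [[0] * 4 for _ in range(4)]
for (i, j), b in blocks.items():
    c = mat_add(c, kron(unit(n, i, j), b))

# The matrix (b^{ij}) in M_2(B), written out as a 4 x 4 block matrix.
block_matrix = [blocks[(i, 0)][k] + blocks[(i, 1)][k] for i in range(n) for k in range(2)]
```

Both constructions produce the same $4 \times 4$ matrix, which is the content of the map $\sum e_{ij} b^{ij} \mapsto (b^{ij})$ in this small case.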


Finally, one of the highlights of the theory of tensor products on $A \otimes B$ is a concept that apparently makes the entire theory superfluous:


Definition C.102. A C*-algebra $A$ is called nuclear if for any C*-algebra $B$, the norms $\|\cdot\|_{\min}$ and $\|\cdot\|_{\max}$ (and consequently all C*-norms) on $A \otimes B$ coincide.


The class of nuclear C*-algebras is large but not exhaustive: if $H$ is infinite-dimensional, then $B_0(H)$ is nuclear but $B(H)$ is not, even if $H$ is separable. However:


Any commutative C*-algebra is nuclear (this underpins Proposition C.100). 
Any finite-dimensional C*-algebra is nuclear (cf. Proposition C.101). 

The (unique!) tensor product of any two nuclear C*-algebras is nuclear. 
Inductive limits of nuclear C*-algebras are nuclear (see §C.14). 

If $0 \to J \to A \to B \to 0$ is a short exact sequence (i.e., if $J \subset A$ is an ideal in $A$ and $B = A/J$) in which two of the three C*-algebras are nuclear, then so is the third.
C.14 Inductive limits and infinite tensor products of C*-algebras


In the main text we deal with infinite quantum systems, albeit as idealizations rather than physical systems that exist in reality. Mathematically, such systems arise as infinite tensor products of C*-algebras, which in turn are special cases of inductive limits, also called direct limits (categorically, these are colimits, see §E.1 below, and as such they are unique up to isomorphism, in this case of C*-algebras).

Let $I$ be a directed set (cf. Definition D.1), typically $I = \mathbb{N}$ with the usual order. Let $(A_i)$ be a family of C*-algebras indexed by $I$; in case that $I = \mathbb{N}$, these will often be

$$A_n = B^n \equiv \hat{\otimes}^n_{\max} B, \tag{C.288}$$

where $B$ is some C*-algebra and $\hat{\otimes}_{\max}$ is the projective tensor product, extended from two C*-algebras (as discussed in the previous section) to any finite number of C*-algebras in the obvious way: for any completed C*-tensor product $\hat{\otimes}$, $n \in \mathbb{N}$, and C*-algebras $(C_1, \ldots, C_n)$, we inductively define the tensor product of the latter as

$$C_1 \hat{\otimes} \cdots \hat{\otimes} C_n = (C_1 \hat{\otimes} \cdots \hat{\otimes} C_{n-1}) \hat{\otimes} C_n. \tag{C.289}$$

In general, the cartesian product $\prod_{i \in I} A_i$ consists of all functions $a : I \to \bigcup_i A_i$ such that $a(i) \equiv a_i \in A_i$; we often write such functions as $(a_i)_i$, where $a_i \in A_i$. The Axiom of Choice then guarantees (or, following Russell, even states) that, provided none of the $A_i$ is empty, the set $\prod_{i \in I} A_i$ is non-empty. Since each $A_i$ is a *-algebra, we can turn $\prod_{i \in I} A_i$ into a *-algebra in the obvious way, i.e., by defining scalar multiplication as $(\lambda \cdot a)(i) = \lambda a(i)$, with pointwise addition, multiplication, and involution. This *-algebra, denoted by $\oplus_i A_i$, is the algebraic direct sum of the $A_i$.

What about the norm? There are various options here, each relying on the choice of some subspace of $\oplus_i A_i$. For example, if $A_0$ consists of all $a \in \prod_{i \in I} A_i$ for which $\lim_i \|a_i\| = 0$, then the algebraic direct sum $\oplus_i A_i$ of the $A_i$ is $A_0$, with norm

$$\|a\| = \sup_i \|a_i\|. \tag{C.290}$$


For the inductive limit we need additional structure, namely a family of homomorphisms $\varphi_{ij} : A_i \to A_j$, defined for each $i \leq j$ in $I$, such that for each $i \leq j \leq k$,

$$\varphi_{ii} = \mathrm{id}_{A_i}; \tag{C.291}$$
$$\varphi_{jk} \circ \varphi_{ij} = \varphi_{ik}. \tag{C.292}$$

Such maps turn the family $(A_i)$ into a so-called directed system of C*-algebras. For example, in case of (C.288), and assuming $B$ has a unit $1_B$ (otherwise there are analogous constructions based on projections), for $n \leq m$, define $\varphi_{nm} : B^n \to B^m$ by

$$\varphi_{nm}(b) = b \otimes 1_B \otimes \cdots \otimes 1_B, \tag{C.293}$$


with $m - n$ units $1_B$. This can be done also in the more general situation (C.289), where we assume each $C_i$ to be unital with unit $1_i$, and define

$$A_n = \hat{\otimes}_{i=1}^n C_i; \tag{C.294}$$
$$\varphi_{nm}(c) = c \otimes 1_{C_{n+1}} \otimes \cdots \otimes 1_{C_m}. \tag{C.295}$$
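
The connecting maps can be tried out in the simplest case $B = M_2(\mathbb{C})$, where tensoring with a unit is a Kronecker product with an identity matrix. This is a sketch under that assumption; the function name `phi` and its signature are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def identity(d):
    """d x d identity matrix."""
    return [[1 if r == c else 0 for c in range(d)] for r in range(d)]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def phi(b, n, m, d=2):
    """Embed an operator on n tensor factors into m >= n factors by appending m - n units."""
    out = b
    for _ in range(m - n):
        out = kron(out, identity(d))
    return out

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 5]]
```

With these definitions one can check the directed-system axioms (C.291) and (C.292) on examples, e.g. that `phi(phi(b, 1, 2), 2, 3)` agrees with `phi(b, 1, 3)`, and that `phi` respects products.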


As a matter of central importance to the theory of quantum spin systems, one may generalize this construction in allowing more general directed sets, whilst specializing it in picking very specific C*-algebras $C_i$. Let $\mathbb{Z}^d \subset \mathbb{R}^d$ be the standard lattice in spatial dimension $d$, and let $I$ be the set of all finite subsets $\Lambda$ of $\mathbb{Z}^d$ (so one typically writes $\Lambda$ instead of $i$). Furthermore, take some fixed Hilbert space $H$, assumed finite-dimensional for simplicity (this also suffices for most applications to quantum statistical mechanics), and for each $\Lambda \in I$, define the cartesian product

$$H^\Lambda = \prod_{x \in \Lambda} H_x, \tag{C.296}$$

where $H_x = H$ for each $x$. Thus elements $\psi : \Lambda \to H$ of $H^\Lambda$ are families $(\psi_x)_{x \in \Lambda}$, where $\psi_x \in H$. To define the tensor product

$$H_\Lambda = \otimes_{x \in \Lambda} H_x, \tag{C.297}$$


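we generalize the procedure explained between (C.245) and (C.246) in the previous section. If $\dim(H_A) < \infty$ and $\dim(H_B) < \infty$, the injection

$$H_A \otimes H_B \hookrightarrow \mathcal{L}(H_A \times H_B, \mathbb{C}), \tag{C.298}$$

is an isomorphism, and we use this fact (with $H_A = H_B = H$) to define $H_\Lambda$ as $\mathcal{L}(H^\Lambda, \mathbb{C})$, that is, the set of all anti-multi-linear maps $\hat{\psi} : H^\Lambda \to \mathbb{C}$, equipped with pointwise operations turning it into a complex vector space. Each element $\psi : \Lambda \to H$ of $H^\Lambda$ itself defines such a map $\hat{\psi} \in \mathcal{L}(H^\Lambda, \mathbb{C})$ via

$$\hat{\psi}(\varphi) = \prod_{x \in \Lambda} \langle \varphi_x, \psi_x \rangle_H, \tag{C.299}$$

through which the inner product on $H_\Lambda$ is defined by linear extension of

$$\langle \hat{\psi}, \hat{\varphi} \rangle_{H_\Lambda} = \prod_{x \in \Lambda} \langle \psi_x, \varphi_x \rangle_H. \tag{C.300}$$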


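In this realization of $H_\Lambda$, the elementary tensors $\otimes_{x \in \Lambda} \psi_x \in H_\Lambda$ coincide with the above elements $\hat{\psi} \in \mathcal{L}(H^\Lambda, \mathbb{C}) = H_\Lambda$. Furthermore, if $(\upsilon_1, \ldots, \upsilon_n)$ is a basis of $H \cong \mathbb{C}^n$, then $(\otimes_{x \in \Lambda} \upsilon_{s(x)})$ is a basis of $H_\Lambda$, where $s : \Lambda \to \{1, \ldots, n\}$. Hence

$$\dim(H_\Lambda) = \dim(H)^{|\Lambda|}. \tag{C.301}$$

Furthermore, writing $\underline{n} = \{1, 2, \ldots, n\}$, and letting $\underline{n}^\Lambda$ be the set of maps (“classical spin configurations”) $s : \Lambda \to \underline{n}$, there is a natural unitary isomorphism

$$H_\Lambda \cong \ell^2(\underline{n}^\Lambda). \tag{C.302}$$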
Indeed, as the functions $\delta_s : t \mapsto \delta_{st}$ form a basis of $\ell^2(\underline{n}^\Lambda)$, the map $\delta_s \mapsto \otimes_{x \in \Lambda} \upsilon_{s(x)}$ extends to a unitary from $\ell^2(\underline{n}^\Lambda)$ to $H_\Lambda$. Under this equivalence, elements of $H_\Lambda$ may be interpreted as “wave-functions” whose argument is a spin configuration.
Returning to C*-algebras, having defined $H_\Lambda$, we now put

$$A_\Lambda = B(H_\Lambda). \tag{C.303}$$
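
To make the counting in (C.301) and (C.302) tangible, the following sketch enumerates the classical spin configurations $s : \Lambda \to \underline{n}$ for a small, arbitrarily chosen region; each configuration labels one basis vector $\delta_s$. The function name `configurations` and the sample region are assumptions for this example.

```python
from itertools import product

def configurations(sites, n):
    """All maps s: sites -> {1, ..., n}, each represented as a dict."""
    return [dict(zip(sites, values))
            for values in product(range(1, n + 1), repeat=len(sites))]

sites = [(0, 0), (0, 1), (1, 0)]   # a finite subset of Z^2 (hypothetical choice)
n = 2                              # dim(H) = 2, i.e. spin-1/2
configs = configurations(sites, n)

# Each configuration s labels one basis vector delta_s of l^2(n^Lambda),
# so the number of configurations is the dimension of H_Lambda.
dim = len(configs)
```

Here `dim` equals `n ** len(sites)`, which is the statement $\dim(H_\Lambda) = \dim(H)^{|\Lambda|}$ for this region.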


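To fit this into the above framework, we note that the partial order $\leq$ on $I$ is given by $\Lambda \leq \Lambda'$ whenever $\Lambda \subseteq \Lambda'$, in which case there is a canonical embedding

$$\iota_{\Lambda\Lambda'} : A_\Lambda \hookrightarrow A_{\Lambda'}. \tag{C.304}$$

This embedding is given as in (C.293), i.e., by adding unit operators. Let $\Lambda \subset \Lambda'$ and define $\Lambda'' = \Lambda' \setminus \Lambda$. We may split $\psi' : \Lambda' \to H$ as $\psi' \mapsto (\psi'_{|\Lambda}, \psi'_{|\Lambda''})$, from which

$$H^{\Lambda'} \cong H^\Lambda \times H^{\Lambda''}. \tag{C.305}$$

As in (C.298), this gives isomorphisms

$$H_{\Lambda'} = \mathcal{L}(H^{\Lambda'}, \mathbb{C}) \cong \mathcal{L}(H^\Lambda \times H^{\Lambda''}, \mathbb{C}) \cong \mathcal{L}(H^\Lambda, \mathbb{C}) \otimes \mathcal{L}(H^{\Lambda''}, \mathbb{C}) \cong H_\Lambda \otimes H_{\Lambda''}. \tag{C.306}$$

This, in turn, induces an isomorphism

$$A_{\Lambda'} = B(H_{\Lambda'}) \cong B(H_\Lambda \otimes H_{\Lambda''}) \cong B(H_\Lambda) \otimes B(H_{\Lambda''}) = A_\Lambda \otimes A_{\Lambda''}, \tag{C.307}$$

which, through the embedding

$$B(H_\Lambda) \hookrightarrow B(H_\Lambda) \otimes B(H_{\Lambda''}); \tag{C.308}$$
$$a \mapsto a \otimes 1_{B(H_{\Lambda''})}, \tag{C.309}$$

gives an embedding $B(H_\Lambda) \hookrightarrow B(H_{\Lambda'})$. This, then, is the injection (C.304).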

Alternatively, $B(H_\Lambda)$ may be constructed just like $H_\Lambda$ itself, i.e., by starting with the set $B(H)^\Lambda$ of functions $a : \Lambda \to B(H)$. Any such $a$ defines an operator $\hat{a}$ on $H_\Lambda$ by first defining its action on elementary tensors by $\hat{a}\hat{\psi} = \otimes_{x \in \Lambda} a_x \psi_x$, and extending the result linearly to arbitrary vectors in $H_\Lambda$. We write $\hat{a} = \otimes_{x \in \Lambda} a_x$, and reconstruct $B(H_\Lambda)$ as the complex vector space spanned by all such elementary operators. The injection (C.304) is given by linear extension of the map $\hat{a} \mapsto \hat{a}'$, where $a'_{x'} = a_x$ whenever $x' = x \in \Lambda \subset \Lambda'$, and $a'_{x'} = 1_H$ otherwise, i.e., if $x' \in \Lambda''$.

Either way, we obtain a directed system of C*-algebras $(A_\Lambda)$, where the finite subsets $\Lambda \subset \mathbb{Z}^d$ are partially ordered by inclusion, and the maps $\varphi_{\Lambda\Lambda'} : A_\Lambda \to A_{\Lambda'}$, with properties like (C.291) - (C.292), are given by the inclusions (C.304).

There is a classical counterpart to this construction, in which the local C*-algebras are given by “functions of functions”, i.e.,

$$A^{(c)}_\Lambda = C(\underline{n}^\Lambda) = C(C(\Lambda, \underline{n})). \tag{C.310}$$
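Since $\underline{n}^\Lambda$ is a finite discrete set, any function on it is continuous (and lies in $L^p$, etc.). If $\Lambda \subset \Lambda'$, then, $s' \in \underline{n}^{\Lambda'}$ being a map $s' : \Lambda' \to \underline{n}$, the connecting homomorphisms

$$\iota^{(c)}_{\Lambda\Lambda'} : A^{(c)}_\Lambda \to A^{(c)}_{\Lambda'}, \tag{C.311}$$

are given quite canonically by

$$\iota^{(c)}_{\Lambda\Lambda'}(f) : s' \mapsto f(s'_{|\Lambda}). \tag{C.312}$$

Note that $C(\underline{n}^\Lambda) = \ell^2(\underline{n}^\Lambda)$ as vector spaces, so that (C.311) also gives natural maps $\ell^2(\underline{n}^\Lambda) \hookrightarrow \ell^2(\underline{n}^{\Lambda'})$, and hence, via (C.302), $H_\Lambda \hookrightarrow H_{\Lambda'}$. These are given by linear extension of the map given on basis vectors by $\otimes_{x \in \Lambda} \upsilon_{s(x)} \mapsto \sum_{s' : s'_{|\Lambda} = s} \otimes_{x' \in \Lambda'} \upsilon_{s'(x')}$.

Furthermore, analogously to (C.307), since $\Lambda' = \Lambda \cup \Lambda''$ is finite, we have

$$A^{(c)}_{\Lambda'} = C(\underline{n}^{\Lambda'}) = C(C(\Lambda', \underline{n})) = C(C(\Lambda \cup \Lambda'', \underline{n})) \cong C(C(\Lambda, \underline{n}) \times C(\Lambda'', \underline{n})) \cong C(C(\Lambda, \underline{n})) \otimes C(C(\Lambda'', \underline{n})) = C(\underline{n}^\Lambda) \otimes C(\underline{n}^{\Lambda''}) = A^{(c)}_\Lambda \otimes A^{(c)}_{\Lambda''}. \tag{C.313}$$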


Given a directed system of C*-algebras $(A_i, \varphi_{ij})$, we define the local part $A_{\mathrm{loc}}$ of $\prod_i A_i$ as the set of all elements $a = (a_i)$ of $\prod_i A_i$ for which there is $i_0 \in I$ (depending on $a$) such that $a_i = \varphi_{i_0 i}(a_{i_0})$ whenever $i_0 \leq i$. This is equivalent to the seemingly stronger condition that $a_j = \varphi_{ij}(a_i)$ whenever $i_0 \leq i \leq j$, since

$$a_j = \varphi_{i_0 j}(a_{i_0}) = \varphi_{ij} \circ \varphi_{i_0 i}(a_{i_0}) = \varphi_{ij}(a_i). \tag{C.314}$$

In the example (C.288) with (C.293), this simply means that for each sequence $(a_n)_{n \in \mathbb{N}}$, there is $n_0 \in \mathbb{N}$ such that $a_n = a_{n_0} \otimes^{n - n_0} 1_B$ for each $n \geq n_0$. Similarly, in the example (C.303) with (C.304), for each $a = (a_\Lambda)$, where $\Lambda$ is a finite subset of $\mathbb{Z}^d$ and $a_\Lambda \in A_\Lambda$ for each $\Lambda$, there is a finite subset $\Lambda_0 \subset \mathbb{Z}^d$ such that for any $\Lambda \supseteq \Lambda_0$ we have $a_\Lambda = \iota_{\Lambda_0 \Lambda}(a_{\Lambda_0})$. It is easy to see that $A_{\mathrm{loc}}$ is a *-algebra under the (pointwise) operations inherited from $\prod_i A_i$. For each $(a_i) \in A_{\mathrm{loc}}$, the norms $\|a_i\|$ form a net in $\mathbb{R}^+$. Recall that some net $(t_i)_{i \in I}$ in $\mathbb{R}$ (which by definition is indexed by a directed set $I$) is said to converge to $t \in \mathbb{R}$ if for each $\varepsilon > 0$, there is $i \in I$ such that $|t - t_j| < \varepsilon$ for all $j \geq i$ (since $\mathbb{R}$ is Hausdorff, any net in $\mathbb{R}$ converges to at most one point). Because the connecting maps $\varphi_{ij}$ are homomorphisms of C*-algebras, they are norm-decreasing (cf. Theorem C.62.1), i.e., $\|\varphi_{ij}(a_i)\| \leq \|a_i\|$. Thus for any $a \in A_{\mathrm{loc}}$ with associated $i_0 \in I$, the (sub)net $(\|a_i\|)_{i \geq i_0}$ lies in the interval $[0, \|a_{i_0}\|]$, and is monotone decreasing in the sense that if $j \geq i \geq i_0$, then $\|a_j\| \leq \|a_i\|$. As for sequences (which are just nets indexed by $I = \mathbb{N}$), bounded monotone decreasing (or increasing) nets in $\mathbb{R}$ converge, so that the net $(\|a_i\|)_{i \geq i_0}$ has a limit, and this also means that $(\|a_i\|)_i$ has the same limit. Call this limit $\|a\|_0$. The map $a \mapsto \|a\|_0$ generally fails to define a norm on $A_{\mathrm{loc}}$, since it may lack the property of positive definiteness, and even if it had it, the space would not be complete (at least if $I$ is infinite, as we tacitly assume). We do have the C*-axioms $\|ab\|_0 \leq \|a\|_0 \|b\|_0$ and $\|a^*a\|_0 = \|a\|_0^2$ though, since these hold for each norm $a_i \mapsto \|a_i\|$ and are preserved in the limit.
So we say that $\|\cdot\|_0$ is a C*-seminorm on $A_{\mathrm{loc}}$, and there is a canonical procedure to turn a *-algebra with C*-seminorm into a C*-algebra:


1. Define the null space $N \subset A_{\mathrm{loc}}$ for $\|\cdot\|_0$ by $N = \{a \in A_{\mathrm{loc}} : \|a\|_0 = 0\}$;
2. Define a norm on the quotient $A_{\mathrm{loc}}/N$ by $\|a + N\| = \|a\|_0$, and complete the quotient in this norm. The result is a C*-algebra

$$A = \varinjlim_i A_i, \tag{C.315}$$

called the inductive limit of the directed system $(A_i, \varphi_{ij})$.


For each $i \in I$, we now define a canonical homomorphism $\varphi_i : A_i \to A$. If $a_i \in A_i$, put $a_j = \varphi_{ij}(a_i) \in A_j$ if $j \geq i$, and $a_j = 0$ otherwise. This gives an element $a \in A_{\mathrm{loc}}$, whose image in $A_{\mathrm{loc}}/N \subset A$ is $\varphi_i(a_i)$. A computation shows that if $i \leq j$, then $\varphi_j \circ \varphi_{ij} = \varphi_i$.


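Using this fact, it follows that if we put $\tilde{A}_i = \varphi_i(A_i) \subset A$, then $\tilde{A}_i \subset \tilde{A}_j$ whenever $i \leq j$, and hence $A$ may be rewritten as the norm-closure of the union of the $\tilde{A}_i$, i.e.,

$$A = \overline{\bigcup_{i \in I} \tilde{A}_i}^{\|\cdot\|}. \tag{C.316}$$

In the simple situation where the maps $\varphi_{ij}$ are inclusions and hence isometries, as in our examples, we have $N = \{0\}$, so that $\tilde{A}_i \cong A_i$, and hence (C.316) simplifies to

$$A = \overline{\bigcup_{i \in I} A_i}^{\|\cdot\|}. \tag{C.317}$$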


As a case in point, define $(A_n, \varphi_{nm})$ as in (C.294) - (C.295). The infinite tensor product of the $C_i$ is then defined through (C.315) and (C.295), i.e., by definition,

$$\hat{\otimes}_{i=1}^\infty C_i = \varinjlim_n \hat{\otimes}_{i=1}^n C_i = \overline{\bigcup_n \hat{\otimes}_{i=1}^n C_i}^{\|\cdot\|}. \tag{C.318}$$

Here the first equation is general, and in the second it is understood that for any $m > n$, we have $\hat{\otimes}_{i=1}^n C_i \subset \hat{\otimes}_{i=1}^m C_i$ through the embeddings (C.295).

More generally, let $(A_x)_{x \in X}$ be a family of unital C*-algebras indexed by an arbitrary set $X$, and let $I = \mathcal{P}_f(X)$ be the set of finite subsets of $X$, partially ordered by inclusion. For any $F \in I$, we have a tensor product

$$A_F = \hat{\otimes}_{x \in F} A_x, \tag{C.319}$$

where once again $\hat{\otimes}$ is an arbitrary completed C*-tensor product. An explicit construction of this tensor product along the lines of (C.289) requires an ordering of $F$, but two such orderings give canonically isomorphic C*-algebras; if $F \subset G$, one should order $G$ compatibly with $F$ for the connecting homomorphisms $\varphi_{FG}$ to be well defined by (C.295). This gives a directed system of C*-algebras $(A_F, \varphi_{FG})$, whose inductive limit defines the tensor product over $X$, i.e.,

$$\hat{\otimes}_{x \in X} A_x = \varinjlim_F \hat{\otimes}_{x \in F} A_x. \tag{C.320}$$
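As a special case, we may rewrite our earlier algebras $A_\Lambda$ and $A^{(c)}_\Lambda$ as

$$A_\Lambda = \hat{\otimes}_{x \in \Lambda} B(H); \tag{C.321}$$
$$A^{(c)}_\Lambda = \hat{\otimes}_{x \in \Lambda} C(\underline{n}) \cong C\Big( \prod_{x \in \Lambda} \underline{n} \Big), \tag{C.322}$$

cf. (C.313). Hence we have

$$\varinjlim_\Lambda A_\Lambda = \hat{\otimes}_{x \in \mathbb{Z}^d} B(H); \tag{C.323}$$
$$\varinjlim_\Lambda A^{(c)}_\Lambda = \hat{\otimes}_{x \in \mathbb{Z}^d} C(\underline{n}) \cong C\Big( \prod_{x \in \mathbb{Z}^d} \underline{n} \Big), \tag{C.324}$$

where in the last expression the infinite product $\prod_{x \in \mathbb{Z}^d} \underline{n}$ is endowed with the product topology, so that (by Tychonoff’s Theorem) the space in question is compact. Thus the ensuing inductive limit may directly be expressed as the standard commutative C*-algebra $C(X)$, where $X = \prod_{x \in \mathbb{Z}^d} \underline{n}$ is compact, equipped with pointwise operations and the sup-norm. If $n = 2$ and $d = 1$, this is a model of the Cantor set.
The homomorphisms $\varphi_i$ enable us to state the universal character of $A$: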


Theorem C.103. Let $(A_i, \varphi_{ij})$ be a directed system of C*-algebras with inductive limit $A$. For any C*-algebra $B$ endowed with a family of homomorphisms $\beta_i : A_i \to B$ such that $\beta_j \circ \varphi_{ij} = \beta_i$, there is a unique homomorphism $\beta : A \to B$ such that $\beta_i = \beta \circ \varphi_i$. In other words, the following diagram commutes:

[Commutative diagram (C.325), relating the maps $\varphi_i : A_i \to A$, $\beta_i : A_i \to B$, and $\beta : A \to B$.]


Proof. This is true almost by construction, or rather by (C.316): since $\beta$ is supposed to be a homomorphism of C*-algebras, it is continuous, so it is determined by its values on the dense subalgebra $\bigcup_i \tilde{A}_i$, and hence by its values on each $\tilde{A}_i$. But these values are necessarily given by $\beta(\varphi_i(a_i)) = \beta_i(a_i)$, where $a_i \in A_i$.


Corollary C.104. Let $(A_x)_{x \in X}$ be a family of mutually commuting unital C*-subalgebras of a unital C*-algebra $B$ (sharing the unit of $B$), such that the C*-algebra generated by all subalgebras $A_x$ within $B$ is equal to $B$. Also, let $\hat{\otimes}$ be some completed C*-tensor product such that for each finite subset $F = \{x_1, \ldots, x_n\} \subset X$, there is an injective homomorphism $\varphi_F : A_F \to B$ (where $A_F = A_{x_1} \hat{\otimes} \cdots \hat{\otimes} A_{x_n}$) satisfying

$$\varphi_F(a_1 \otimes \cdots \otimes a_n) = a_1 \cdots a_n \quad (a_1 \in A_{x_1}, \ldots, a_n \in A_{x_n}). \tag{C.326}$$

Then $B \cong \hat{\otimes}_{x \in X} A_x$.


Proof. In Theorem C.103, take $A_i \rightsquigarrow A_F$ and $\beta_i \rightsquigarrow \varphi_F$. In view of (C.320), this gives a homomorphism $\beta : \hat{\otimes}_{x \in X} A_x \to B$. Here, this map is an isomorphism.
Finally, we give a result on infinite tensor products of states, needed in §8.4.


Proposition C.105. Let $(C_i)_{i \in \mathbb{N}}$ be unital C*-algebras, and define their infinite (projective) tensor product $\hat{\otimes}_{i=1}^\infty C_i$ as in (C.318). For each $i \in \mathbb{N}$, let $\omega_i$ be a state on $C_i$. Then there is a unique state $\otimes_{i=1}^\infty \omega_i$ on $\hat{\otimes}_{i=1}^\infty C_i$ such that for each $n \in \mathbb{N}$ and $c_i \in C_i$,

$$\otimes_{i=1}^\infty \omega_i\,(\varphi_n(c_1 \otimes \cdots \otimes c_n)) = \prod_{i=1}^n \omega_i(c_i). \tag{C.327}$$

Moreover, $\otimes_{i=1}^\infty \omega_i$ is pure iff each $\omega_i$ is pure.
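
For two factors $C_1 = C_2 = M_2(\mathbb{C})$ the product formula (C.327) and the purity statement can be checked by hand. The sketch below assumes the standard description of states on $M_n(\mathbb{C})$ by density matrices, $\omega(c) = \mathrm{Tr}(\rho c)$, with purity detected by $\mathrm{Tr}(\rho^2) = 1$; the particular matrices are arbitrary illustrative choices.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def state(rho):
    """State c -> Tr(rho c) defined by a density matrix rho."""
    return lambda c: trace(matmul(rho, c))

def purity(rho):
    """Tr(rho^2); equals 1 exactly when the state is pure."""
    return trace(matmul(rho, rho))

identity2 = [[1, 0], [0, 1]]
rho1 = [[Fraction(1, 2), 0], [0, Fraction(1, 2)]]   # mixed state on C_1
rho2 = [[1, 0], [0, 0]]                             # pure state on C_2
c1 = [[1, 2], [3, 4]]
c2 = [[2, 1], [1, 5]]

omega1, omega2 = state(rho1), state(rho2)
omega12 = state(kron(rho1, rho2))                   # the product state

lhs = omega12(kron(c1, c2))
rhs = omega1(c1) * omega2(c2)
```

In this example `lhs` and `rhs` agree, the product state restricted along $c \mapsto c \otimes 1$ returns `omega1`, and the product state is not pure because `rho1` is not.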


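Proof. We write $C^n = \hat{\otimes}_{i=1}^n C_i$, and similarly $\otimes_{i=1}^n \omega_i = \omega^n$, also for $n = \infty$.

Eq. (C.327) defines $\omega^\infty$ on a dense subset $\bigcup_{n \in \mathbb{N}} \varphi_n(C^n)$ of $C^\infty$, which proves uniqueness. Existence comes from Proposition C.98, according to which the map $c_1 \otimes \cdots \otimes c_n \mapsto \prod_i \omega_i(c_i)$ extends to a state $\otimes_{i=1}^n \omega_i = \omega^n$ on $C^n$, which in turn defines a state $\omega^n$ on $\varphi_n(C^n) \subset C^\infty$. Since $(\otimes_{i=1}^n \omega_i)_{|C^m} = \otimes_{i=1}^m \omega_i$ whenever $m \leq n$, one also has $\omega^n_{|\varphi_m(C^m)} = \omega^m$, so that we may define a functional $\omega^\infty$ on $\bigcup_n \varphi_n(C^n)$ by its restrictions $\omega^\infty_{|\varphi_n(C^n)} = \omega^n$. Since $\omega^n$ is a state and hence satisfies $\|\omega^n\| = \omega^n(1_{\varphi_n(C^n)}) = 1$, so does $\omega^\infty$ (on its dense domain). Since the continuous extension of $\omega^\infty$ to $C^\infty$ has the same norm, this extension (still called $\omega^\infty$) is a state by Proposition C.5.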

One direction of the second claim is trivial: if at least one of the $\omega_i$ fails to be pure, then $\omega^\infty$ inherits its convex decomposition so to speak, so contrapositively we obtain that purity of $\omega^\infty$ implies purity of each $\omega_i$. We first prove the opposite direction for $n < \infty$. Using Proposition C.91 and the fact that $C^n$ is a completion of the algebraic tensor product $\otimes_{i=1}^n C_i$, the GNS-representation $\pi_{\omega^n}(C^n)$ is unitarily equivalent to the representation $\pi_{\omega_1} \otimes \cdots \otimes \pi_{\omega_n}$ on $H_{\omega_1} \otimes \cdots \otimes H_{\omega_n}$, and

$$(\pi_{\omega_1} \otimes \cdots \otimes \pi_{\omega_n}(C^n))'' = \pi_{\omega_1}(C_1)'' \bar{\otimes} \cdots \bar{\otimes} \pi_{\omega_n}(C_n)''. \tag{C.328}$$

Here, for any two von Neumann algebras $A$ and $B$, $A \bar{\otimes} B$ is the smallest von Neumann algebra containing the algebraic tensor product $A \otimes B$. The main lemma behind the second claim is the nontrivial commutation theorem for von Neumann algebras:

$$(A \bar{\otimes} B)' = A' \bar{\otimes} B', \tag{C.329}$$

which we state without proof. This iterates to $n$ von Neumann algebras. Hence

$$(\pi_{\omega_1} \otimes \cdots \otimes \pi_{\omega_n}(C^n))' = \pi_{\omega_1}(C_1)' \bar{\otimes} \cdots \bar{\otimes} \pi_{\omega_n}(C_n)', \tag{C.330}$$

so that the claim for $n < \infty$ follows from Theorem C.90.
Now take $n = \infty$, and assume each $\omega_i$ is pure. Suppose that for some $t \in (0,1)$,

$$\omega^\infty = t\omega' + (1 - t)\omega'', \tag{C.331}$$

and restrict this equality to $\varphi_n(C^n)$. By the previous argument, the restriction of $\omega^\infty$ to $\varphi_n(C^n)$, which is just $\omega^n$, is pure for any $n \in \mathbb{N}$. This gives $\omega'_{|\varphi_n(C^n)} = \omega''_{|\varphi_n(C^n)}$. This is true for each $n$, so that $\omega' = \omega''$. Hence $\omega^\infty$ is pure.
C.15 Gelfand isomorphism and Fourier theory


One of the most beautiful applications of Theorem C.8 is to commutative harmonic analysis. Let $G$ be an abelian locally compact Hausdorff group (e.g., $G = \mathbb{R}$, $G = \mathbb{Z}$, or $G = \mathbb{T}$). Such groups have an invariant Haar measure $dx$, which satisfies

$$\int_G dx\, L_y f(x) = \int_G dx\, f(x^{-1}) = \int_G dx\, f(x), \tag{C.332}$$

for any $f \in C_c(G)$ and $y \in G$, where

$$L_y f(x) = f(y^{-1}x). \tag{C.333}$$

This measure is unique up to rescaling; if $G$ is compact, it is normalized such that $\int_G dx = 1$. For $G = \mathbb{R}$, this recovers Lebesgue measure on $\mathbb{R}$, whilst for $\mathbb{Z}$ and $\mathbb{T}$,

$$\int_{\mathbb{Z}} dx\, f(x) = \sum_{n \in \mathbb{Z}} f(n); \tag{C.334}$$
$$\int_{\mathbb{T}} dx\, f(x) = \int_{-\pi}^{\pi} \frac{d\theta}{2\pi} f(e^{i\theta}). \tag{C.335}$$

For $f, g \in C_c(G)$, the convolution product $f * g$ is defined by

$$f * g(x) = \int_G dy\, f(y) g(y^{-1}x). \tag{C.336}$$


Using (C.332), it is easy to verify that this product is commutative and associative. Also, one may define an involution on $C_c(G)$ by

$$f^*(x) = \overline{f(x^{-1})}. \tag{C.337}$$

We would now like to turn $C_c(G)$ into a commutative C*-algebra, but the obvious norms like the $L^p$-ones do not accomplish this. Instead, for $f \in C_c(G)$ we define an operator $\pi(f)$ on the Hilbert space $L^2(G)$ (defined with respect to Haar measure) by

$$\pi(f)\psi = f * \psi, \tag{C.338}$$

initially for $\psi \in C_c(G)$. Equivalently, we may write

$$\pi(f) = \int_G dy\, f(y) L_y, \tag{C.339}$$

where we regard $L_y$ as an (obviously unitary) operator on $L^2(G)$, and the integral is most easily defined weakly, i.e., $\pi(f)$ is the unique bounded operator for which

$$\langle \varphi, \pi(f)\psi \rangle = \int_G dy\, f(y) \langle \varphi, L_y \psi \rangle. \tag{C.340}$$
Since $L_y$ is unitary, this formula also shows that $|\langle \varphi, \pi(f)\psi \rangle| \leq \|f\|_1 \|\varphi\| \|\psi\|$, where $\|f\|_1 = \int_G dx\, |f(x)|$. Taking $\varphi = \pi(f)\psi$ gives $\|\pi(f)\psi\| \leq \|f\|_1 \|\psi\|$, whence

$$\|\pi(f)\| \leq \|f\|_1. \tag{C.341}$$

Hence $\pi(f)$ is bounded and extends from $C_c(G)$ to all of $L^2(G)$ by continuity.
Lemma C.106. The map $f \mapsto \pi(f)$ from $C_c(G)$ to $B(L^2(G))$ is injective and satisfies

$$\pi(f * g) = \pi(f)\pi(g); \tag{C.342}$$
$$\pi(f^*) = \pi(f)^*. \tag{C.343}$$


Proof. Eq. (C.342) follows from associativity of convolution, and (C.343) follows from the last equality in (C.332). To prove injectivity, we fix $f \in C_c(G)$, pick $\varepsilon > 0$, and find a neighbourhood $U$ of $e \in G$ such that $y^{-1}x \in U$ implies $|f(y) - f(x)| < \varepsilon$. Then, using Urysohn’s Lemma, one may find a positive function $\psi_U \in C_c(U)$ such that $\int_G \psi_U = 1$. Injectivity of $\pi$ then immediately follows from the easy estimate

$$|\pi(f)\psi_U(x) - f(x)| \leq \int_G dy\, |f(y) - f(x)| \cdot |\psi_U(y^{-1}x)| < \varepsilon.$$


Definition C.107. Let $G$ be an abelian locally compact Hausdorff group. The group C*-algebra $C^*(G)$ is the norm closure of $\pi(C_c(G))$ in $B(L^2(G))$, with norm

$$\|f\|_{C^*} = \|\pi(f)\|_{B(L^2(G))}. \tag{C.344}$$

Since $\pi(C_c(G))$ is a commutative *-algebra in $B(L^2(G))$ by Lemma C.106, it is easy to see (from joint continuity of multiplication) that its norm closure $C^*(G)$ is a commutative C*-algebra, whose Gelfand spectrum we wish to compute.

To this effect, we first define the dual group or character group $\hat{G}$ of $G$ as

$$\hat{G} = \mathrm{Hom}(G, \mathbb{T}), \tag{C.345}$$

i.e., the set of continuous group homomorphisms from $G$ to $\mathbb{T}$, equipped with the compact-open topology. This topology is defined as the restriction to $\mathrm{Hom}(G, \mathbb{C})$ of the topology on $C(G, \mathbb{C})$ generated by the neighbourhood basis of some $\gamma \in \hat{G}$, i.e.,

$$O(\gamma, K, \varepsilon) = \{\varphi \in \hat{G} : |\gamma(x) - \varphi(x)| < \varepsilon\ \forall x \in K\}, \tag{C.346}$$

where $K \in \mathcal{K}(G)$ and $\varepsilon > 0$. The corresponding notion of convergence is uniform convergence on each compact subset of $G$; in particular, if $G$ is compact, this is just uniform convergence. Equipped with this topology, it can be shown that $\hat{G}$ is itself an abelian locally compact Hausdorff group under pointwise operations, i.e.,

$$(\gamma_1 \gamma_2)(x) = \gamma_1(x)\gamma_2(x); \tag{C.347}$$
$$\gamma^{-1}(x) = \overline{\gamma(x)}; \tag{C.348}$$

hence the ensuing unit $\hat{e}$ in $\hat{G}$ is the identity function $\hat{e} = 1_G$ in $\mathrm{Hom}(G, \mathbb{T})$.
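Proposition C.108. We have the following examples of dual groups:

$$\hat{\mathbb{Z}} \cong \mathbb{T}, \quad \gamma_z(n) = z^n; \tag{C.349}$$
$$\hat{\mathbb{R}} \cong \mathbb{R}, \quad \gamma_p(x) = e^{ipx}; \tag{C.350}$$
$$\hat{\mathbb{T}} \cong \mathbb{Z}, \quad \gamma_n(z) = z^n; \tag{C.351}$$
$$\hat{\mathbb{Z}}_p \cong \mathbb{Z}_p, \quad \gamma_m([n]) = e^{2\pi i mn/p}. \tag{C.352}$$

Here $\mathbb{Z}_p = \mathbb{Z}/(p \cdot \mathbb{Z})$ is the (finite) group of integers mod $p$.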
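
The last example (C.352) is small enough to verify directly. The following sketch, with the arbitrary choice $p = 5$ and a hypothetical helper `gamma`, checks that each $\gamma_m$ is a homomorphism into $\mathbb{T}$, that pointwise multiplication of characters (C.347) corresponds to addition of labels mod $p$, and that distinct characters are orthogonal with respect to normalized Haar measure on $\mathbb{Z}_p$.

```python
import cmath

p = 5

def gamma(m):
    """Character gamma_m on Z_p: [n] -> exp(2 pi i m n / p)."""
    return lambda n: cmath.exp(2j * cmath.pi * m * n / p)

def close(a, b, tol=1e-9):
    return abs(a - b) < tol

group = range(p)

is_homomorphism = all(
    close(gamma(m)((n + k) % p), gamma(m)(n) * gamma(m)(k))
    for m in group for n in group for k in group
)
takes_values_in_circle = all(close(abs(gamma(m)(n)), 1.0) for m in group for n in group)
dual_group_law = all(
    close(gamma(m)(n) * gamma(k)(n), gamma((m + k) % p)(n))
    for m in group for k in group for n in group
)

def pairing(m, k):
    """(1/p) * sum_n conj(gamma_m(n)) gamma_k(n), i.e. the L^2 inner product."""
    return sum(gamma(m)(n).conjugate() * gamma(k)(n) for n in group) / p

orthonormal = all(
    close(pairing(m, k), 1.0 if m == k else 0.0) for m in group for k in group
)
```

All four checks hold, so the $p$ characters $\gamma_0, \ldots, \gamma_{p-1}$ form a group isomorphic to $\mathbb{Z}_p$ in this instance.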


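Proof. For (C.349), any character $\gamma : \mathbb{Z} \to \mathbb{T}$ is determined by its value $\gamma(1) = z$, since for $n > 0$ we have $\gamma(n) = \gamma(1 + \cdots + 1) = \gamma(1)^n = z^n$, where the sum has $n$ terms; for $n < 0$, we obtain the same result from $\gamma(n) = \gamma(-n)^{-1} = (z^{-n})^{-1} = z^n$.

To prove (C.350), we need to solve $\gamma(x + y) = \gamma(x)\gamma(y)$ with $\gamma(0) = 1$, where $\gamma : \mathbb{R} \to \mathbb{T}$ is continuous. To see that (C.350) gives all solutions, find $\varepsilon > 0$ for which $\int_0^\varepsilon dy\, \gamma(y) \equiv a \neq 0$; this is possible, since $\gamma(0) = 1$ and $\gamma$ is continuous. Then

$$a\gamma(x) = \int_0^\varepsilon dy\, \gamma(x)\gamma(y) = \int_0^\varepsilon dy\, \gamma(x + y) = \int_x^{\varepsilon + x} dy\, \gamma(y), \tag{C.353}$$

so that $\gamma$ is differentiable, with, writing $\dot{\gamma}$ for $d\gamma/dx$,

$$a\dot{\gamma}(x) = \gamma(\varepsilon + x) - \gamma(x) = (\gamma(\varepsilon) - 1)\gamma(x). \tag{C.354}$$

Hence $\dot{\gamma}(x) = c\gamma(x)$ with $c = (\gamma(\varepsilon) - 1)/a$, so that $\gamma(x) = \exp(cx)$. Since $|\gamma(x)| = 1$, this forces $c = ip$ for some $p \in \mathbb{R}$. This also implies (C.351), since $\mathbb{T} \cong \mathbb{R}/\mathbb{Z}$ and hence the characters of $\mathbb{T}$ are those characters of $\mathbb{R}$ that map $\mathbb{Z}$ to 1. Similarly, (C.352) follows from (C.349): the characters on $\mathbb{Z}$ that are trivial on $p \cdot \mathbb{Z}$ take the form $\gamma(n) = z^n$ for some $p$-th root of unity $z = \exp(2\pi i m/p)$, $m \in \{1, \ldots, p\}$.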


Theorem C.109. Let $G$ be an abelian locally compact Hausdorff group. Then the Gelfand spectrum $\Sigma(C^*(G))$ is homeomorphic to $\hat{G}$, and the Gelfand isomorphism

$$C^*(G) \cong C_0(\hat{G}) \tag{C.355}$$

is given on the dense subspace $C_c(G) \subset C^*(G)$ by the generalized Fourier transform

$$\hat{f}(\gamma) = \int_G dx\, \overline{\gamma(x)} f(x). \tag{C.356}$$


Thus the Fourier transform is a special case of the Gelfand transform (which is 
noteworthy if only because Gelfand himself promulgated the unity of mathematics). 


Proof. We will prove that each character $\gamma \in \hat{G}$ on $G$ defines a character $\omega_\gamma$ on $C^*(G)$ by continuous extension (i.e., from its dense subspace $C_c(G)$ to $C^*(G)$) of

$$\omega_\gamma(f) = \hat{f}(\gamma), \tag{C.357}$$

as in (C.356), and that the map $\gamma \mapsto \omega_\gamma$ gives a homeomorphism $\hat{G} \to \Sigma(C^*(G))$.
It follows from simple computations that for $f, g \in C_c(G)$, one has

$$\omega_\gamma(f * g) = \omega_\gamma(f)\omega_\gamma(g); \tag{C.358}$$
$$\omega_\gamma(f^*) = \overline{\omega_\gamma(f)}. \tag{C.359}$$
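
The "simple computations" can be mirrored numerically on the finite group $G = \mathbb{Z}_p$. This is only an illustrative sketch: it assumes normalized Haar measure (so integrals are averages over $\mathbb{Z}_p$), the sample functions `f` and `g` are arbitrary, and the last check is the bound by $\|f\|_1$ that follows from $|\gamma(x)| = 1$, which is weaker than the bound by $\|f\|_{C^*}$ discussed in the proof.

```python
import cmath

p = 6

def gamma(m):
    """Character gamma_m on Z_p."""
    return lambda n: cmath.exp(2j * cmath.pi * m * n / p)

def conv(f, g):
    """Convolution (C.336) on Z_p with normalized Haar measure."""
    return [sum(f[y] * g[(x - y) % p] for y in range(p)) / p for x in range(p)]

def star(f):
    """Involution (C.337): f*(x) = conj(f(-x))."""
    return [complex(f[(-x) % p]).conjugate() for x in range(p)]

def fourier(f, m):
    """Generalized Fourier transform (C.356) evaluated at gamma_m."""
    return sum(gamma(m)(x).conjugate() * f[x] for x in range(p)) / p

def close(a, b, tol=1e-9):
    return abs(a - b) < tol

f = [1, 2j, -1, 0, 3, 1 - 1j]
g = [0, 1, 1 + 1j, 3, -2, 0.5]

multiplicative = all(
    close(fourier(conv(f, g), m), fourier(f, m) * fourier(g, m)) for m in range(p)
)
star_preserving = all(
    close(fourier(star(f), m), fourier(f, m).conjugate()) for m in range(p)
)
l1_norm = sum(abs(v) for v in f) / p
bounded_by_l1 = all(abs(fourier(f, m)) <= l1_norm + 1e-12 for m in range(p))
```

For these sample functions the transform turns convolution into pointwise multiplication and the involution into complex conjugation, as in (C.358) and (C.359).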


To finish the proof, we need three further nontrivial facts about the map $\gamma \mapsto \omega_\gamma$:


1. It is surjective, i.e., if $\omega \in \Sigma(C^*(G))$, then $\omega = \omega_\gamma$ for some $\gamma \in \hat{G}$.
2. It is injective, in that $\hat{f}(\gamma) = \hat{f}(\gamma')$ for all $f \in C_c(G)$ implies $\gamma = \gamma'$. Moreover, the character $\omega_\gamma$ initially defined on $C_c(G)$ by (C.357) will be shown to satisfy

$$|\omega_\gamma(f)| \leq \|f\|_{C^*}. \tag{C.360}$$

Thus $\omega_\gamma : C_c(G) \to \mathbb{C}$ may be extended to $C^*(G)$ by continuity in the usual way.
3. The compact-open topology on $\hat{G}$ is mapped to the Gelfand topology on $\Sigma(C^*(G))$.


To prove the first point, we restrict a character $\omega : C^*(G) \to \mathbb{C}$ to $C_c(G)$ and note that because of the bound (C.341), this restriction in turn extends to an element of $L^1(G)^*$, which we still call $\omega$. Entry 10 in Table B.1 gives $L^1(X)^* \cong L^\infty(X)$, in the sense that any $\varphi \in L^1(X)^*$ is given by $\varphi(g) = \int_X fg$ for some $f \in L^\infty(X)$. Hence

$$\omega(f) = \int_G dx\, \hat{\omega}(x) f(x), \tag{C.361}$$

where $\hat{\omega} \in L^\infty(G)$. The multiplicative property $\omega(f * g) = \omega(f)\omega(g)$ then gives

$$\hat{\omega}(xy) = \hat{\omega}(x)\hat{\omega}(y) \tag{C.362}$$

almost everywhere (a.e.) with respect to Haar measure.
To prove continuity of $\hat{\omega}$, compare the following expressions with $f, g \in C_c(G)$:


These must coincide, so if we pick some $f \in C_c(G)$ for which $\omega(f) \neq 0$ (which is possible since $C_c(G)$ is dense in $C^*(G)$ and $\omega$ is not identically zero), then we obtain

$$\hat{\omega}(x) = \omega(L_x f)/\omega(f), \tag{C.363}$$

almost everywhere. Hence we may redefine $\hat{\omega}$ by (C.363) for all $x \in G$. Since

$$|\omega(L_x f) - \omega(L_y f)| \leq \|L_x f - L_y f\|_{C^*} \leq \|L_x f - L_y f\|_1 \leq C \|L_x f - L_y f\|_\infty, \tag{C.364}$$

recalling that $f$ has compact support, it follows that the function $x \mapsto \omega(L_x f)$ is continuous, whence also $\hat{\omega}$ as redefined by (C.363) is continuous.
We now show that $\hat{\omega}(x) \in \mathbb{T}$. If $|\hat{\omega}(x)| > 1$, then $\hat{\omega}$ cannot be bounded (whereas we know it lies in $L^\infty(G)$), because $\hat{\omega}(x^n) = \hat{\omega}(x)^n$ by (C.362). But the same is true if $|\hat{\omega}(x)| < 1$, because using $\hat{\omega}(x^{-1}) = \hat{\omega}(x)^{-1}$ (which follows from (C.362) and (C.363), which gives $\hat{\omega}(e) = 1$), the same argument applies with $x^{-1}$ instead of $x$. Thus $\hat{\omega} : G \to \mathbb{T}$ is a character $\bar{\gamma} \in \hat{G}$ (where the bar is conventional), so that (C.361) turns into (C.356). As to injectivity, if $\hat{f}(\gamma) = \hat{f}(\gamma')$ for all $f \in C_c(G)$, then

$$\int_G dx\, (\overline{\gamma(x)} - \overline{\gamma'(x)}) f(x) = 0, \tag{C.365}$$

for all such $f$, which by standard integration theory gives $\gamma' = \gamma$ a.e. and hence everywhere, since both functions are continuous. To prove (C.360), we use a trick: take some fixed $\omega \in \Sigma(C^*(G))$, so that $\omega(f) = \hat{f}(\gamma_0)$ for some $\gamma_0 \in \hat{G}$ by the previous step of the proof, and $|\omega(f)| \leq \|f\|_{C^*}$ for all $f \in C^*(G)$. For $\gamma \in \hat{G}$ and $f \in C_c(G)$, eqs. (C.356) and (C.347) give $\omega_\gamma(f) = \omega(\bar{\gamma}\gamma_0 f)$, where $\bar{\gamma}\gamma_0 f$ is the pointwise product of the three given functions from $G$ to $\mathbb{C}$. Hence

$$|\omega_\gamma(f)| = |\omega(\bar{\gamma}\gamma_0 f)| \leq \|\pi(\bar{\gamma}\gamma_0 f)\| = \|\bar{\gamma}\gamma_0 f\|_{C^*}. \tag{C.366}$$


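We now denote $\bar{\gamma}\gamma_0$ by $\gamma'$, which lies in $\hat{G}$, and note that for any $\gamma' \in \hat{G}$, we have

$$\langle \varphi, \pi(\gamma' f)\psi \rangle = \langle \bar{\gamma}'\varphi, \pi(f)(\bar{\gamma}'\psi) \rangle \quad (\varphi, \psi \in L^2(G),\ f \in C_c(G)). \tag{C.367}$$

Taking $\varphi = \pi(\gamma' f)\psi$ and using Cauchy-Schwarz as well as $\|\bar{\gamma}'\varphi\| = \|\varphi\|$, gives

$$\|\pi(\gamma' f)\psi\| \leq \|\pi(f)\bar{\gamma}'\psi\| \quad (\psi \in L^2(G),\ f \in C_c(G),\ \gamma' \in \hat{G}). \tag{C.368}$$

Taking the sup over all $\psi \in L^2(G)$ with $\|\psi\| = 1$ (which also means $\|\bar{\gamma}'\psi\| = 1$) gives $\|\pi(\gamma' f)\| \leq \|\pi(f)\|$. Combined with (C.366) and (C.360), this gives the bound

$$|\omega_\gamma(f)| \leq \|f\|_{C^*}. \tag{C.369}$$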


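We now prove continuity of the map $\omega_\gamma \mapsto \gamma$ from $\Sigma(C^*(G))$ to $\hat{G}$ (using sequences for simplicity, the argument for nets being similar). We must show that if $\omega_{\gamma_n} \to \omega_\gamma$, i.e., $\hat{f}(\gamma_n) \to \hat{f}(\gamma)$ for each $f \in C^*(G)$, and hence for each $f \in C_c(G)$, then $\gamma_n \to \gamma$ uniformly on any $K \in \mathcal{K}(G)$. Writing $\tilde{\gamma}_n = \gamma_n \bar{\gamma}$ and $g = f\bar{\gamma}$, we first notice that

$$|\gamma_n(x) - \gamma(x)| = |\tilde{\gamma}_n(x) - 1|; \tag{C.370}$$
$$\hat{f}(\gamma_n) - \hat{f}(\gamma) = \hat{g}(\tilde{\gamma}_n) - \hat{g}(1_G). \tag{C.371}$$

This shows that we may reduce the proof to the case $\gamma = 1_G$; otherwise, simply change $\gamma_n$ to $\tilde{\gamma}_n$. Thus we assume that $\hat{f}(\gamma_n) \to \hat{f}(1_G)$ for each $f \in C_c(G)$. We now pick some fixed $g \in C_c(G)$ such that $\hat{g}(1_G) = \int_G dx\, g(x) = 1$. For $\varepsilon > 0$, by uniform continuity there is a neighbourhood $U$ of the identity $e \in G$ such that, cf. (C.364),

$$\|L_u g - g\|_1 < \varepsilon/3 \quad (u \in U). \tag{C.372}$$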
Then $\bigcup_{x \in G} xU$ covers $G$, and hence also covers each compact set $K \subset G$. Therefore,
$K$ has a finite subcover $\bigcup_{j \in J} x_j U$. Define $g_j = L_{x_j} g$. By invariance of the Haar
measure, we have $\hat g_j(1_G) = 1$, so that by definition of $\omega_{\gamma_n} \to \omega_{1_G}$, we may find $N \in \mathbb{N}$
such that for each $j \in J$ and for all $n > N$, we have

$$ |\hat g_j(\gamma_n) - 1| < \varepsilon/3. \tag{C.373} $$


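Also, if $x \in K$, then $x = x_j u$ for some $j \in J$ and $u \in U$. Eq. (C.372) then implies

$$ \left|\overline{\gamma_n(x)}\,\hat g(\gamma_n) - \hat g_j(\gamma_n)\right| = \left| \int_G dy\, \overline{\gamma_n(y)} \left(L_{x_j u}g(y) - L_{x_j}g(y)\right) \right| \leq \|L_u g - g\|_1 < \varepsilon/3. \tag{C.374} $$

Hence for any $K \in \mathcal{K}(G)$ and $x \in K$ as above, we may estimate, for all $n > N$
(enlarging $N$ if necessary so that also $|\hat g(\gamma_n) - 1| < \varepsilon/3$),

$$ |\gamma_n(x) - 1| = \left|\overline{\gamma_n(x)} - 1\right| \leq \left|\overline{\gamma_n(x)}\,(1 - \hat g(\gamma_n))\right| + \left|\overline{\gamma_n(x)}\,\hat g(\gamma_n) - \hat g_j(\gamma_n)\right| + |\hat g_j(\gamma_n) - 1| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \tag{C.375} $$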


Consequently, $\hat f(\gamma_n) \to \hat f(1_G)$ for each $f \in C_c(G)$ implies $\gamma_n \to 1_G$ in $\hat G$; as we have
argued, this proves continuity of the bijection $\Sigma(C^*(G)) \to \hat G$ given by $\omega_\gamma \mapsto \gamma$.

If $\Sigma(C^*(G))$ and $\hat G$ are compact (which is the case iff $G$ is discrete, in which case
$C^*(G)$ has a unit $\delta_e$) we are ready, since a continuous bijection from a compact space
to a Hausdorff space has a continuous inverse, and hence is a homeomorphism (in
our case, both spaces are compact as well as Hausdorff). In general, continuity of the
map $\gamma \mapsto \omega_\gamma$ from $\hat G$ to $\Sigma(C^*(G))$ almost immediately follows from the definition of
the compact-open topology on $\hat G$: if $\gamma_n \to \gamma$ in this topology (similarly for nets), and
$f \in C_c(G)$, then $\hat f(\gamma_n) \to \hat f(\gamma)$, and hence $\omega_{\gamma_n}(f) \to \omega_\gamma(f)$. A simple $\varepsilon/3$-argument
then gives the same result for $f \in C^*(G)$.


Note that local compactness of $\hat G$ (though provable directly) also follows from this
theorem, since we know this for the Gelfand spectrum $\Sigma(C^*(G))$, cf. Theorem C.45.

Beside the Gelfand isomorphism (C.355), in which the two function spaces
$C^*(G)$ and $C_0(\hat G)$ are of a different type, there exist more symmetric versions of
the generalized Fourier transform (C.356). In the setting of Banach spaces (as opposed
to spaces of distributions, which would take us into the territory of locally
convex topological vector spaces, and hence outside the scope of this appendix,
though cf. §5.11), there are (at least) two natural possibilities. The traditional and
most familiar one is provided by the Hilbert spaces $L^2(G)$ and $L^2(\hat G)$, defined with
respect to suitably normalized Haar measures $dx$ (on $G$) and $d\gamma$ (on $\hat G$), respectively.
A second, more recent possibility is to use the following two Banach spaces.


Definition C.110. The Banach space $C_0^*(G)$ is the completion of $C_c(G)$ in the norm

$$ \|f\|_0 = \max\{\|f\|_\infty, \|\hat f\|_\infty\}. \tag{C.376} $$

Similarly, the Banach space $C_0^*(\hat G)$ is the completion of $C_c(\hat G)$ in the norm

$$ \|\xi\|_0 = \max\{\|\xi\|_\infty, \|\hat\xi\|_\infty\}. \tag{C.377} $$
It follows that $C_0^*(G)$ can be norm-decreasingly injected into both $C^*(G)$ and $C_0(G)$,
so that $C_0^*(G)$ is a subspace of $C_0(G)$ as well as of $C^*(G)$. By (C.341) and (C.360),

$$ L^1(G) \cap C_0(G) \subset C_0^*(G), \tag{C.378} $$

and similarly for $C_0^*(\hat G)$. Indeed, $C_0^*(G)$ (and likewise $C_0^*(\hat G)$) could equivalently
have been defined as the completion of $L^1(G) \cap C_0(G)$ in the norm (C.376).


Theorem C.111. The Fourier transform (C.356) induces isometric isomorphisms

$$ L^2(G) \cong L^2(\hat G); \tag{C.379} $$
$$ C_0^*(G) \cong C_0^*(\hat G), \tag{C.380} $$

such that, on suitably normalizing $dx$ and $d\gamma$, the Fourier inversion formula

$$ f(x) = \int_{\hat G} d\gamma\, \gamma(x)\hat f(\gamma), \tag{C.381} $$

cf. (C.356), in both cases holds verbatim whenever $f \in L^1(G)$ and $\hat f \in L^1(\hat G)$, in
which case $f$ and $\hat f$ are continuous, and (C.356) and (C.381) hold pointwise.
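
As a sanity check, the statements of the theorem can be verified numerically in the simplest case, the finite cyclic group $\mathbb{Z}_n$, whose characters are $\gamma_k(x) = e^{2\pi i k x/n}$. The sketch below is only an illustration under stated assumptions: it takes counting measure on $\mathbb{Z}_n$, the measure on the dual group that gives each character weight $1/n$ (one possible "suitable normalization"), and the convention $\hat f(\gamma) = \sum_x \overline{\gamma(x)} f(x)$. The function names and the sample function `f` are hypothetical.

```python
import cmath

def character(n, k):
    """Character gamma_k of Z_n: x -> exp(2*pi*i*k*x/n)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / n)

def fourier(f):
    """hat f(gamma_k) = sum_x conj(gamma_k(x)) f(x), counting measure on Z_n."""
    n = len(f)
    return [sum(character(n, k)(x).conjugate() * f[x] for x in range(n))
            for k in range(n)]

def inverse_fourier(fhat):
    """f(x) = (1/n) sum_k gamma_k(x) hat f(gamma_k); the weight 1/n is the
    assumed normalization of the measure on the dual group."""
    n = len(fhat)
    return [sum(character(n, k)(x) * fhat[k] for k in range(n)) / n
            for x in range(n)]

# A sample function on Z_5 (values chosen arbitrarily).
f = [1.0, 2.0 - 1.0j, 0.5j, -3.0, 0.25]
fhat = fourier(f)

# Inversion formula: transforming back recovers f.
inversion_error = max(abs(a - b) for a, b in zip(f, inverse_fourier(fhat)))

# Special case at the unit: f(e) equals the integral of hat f over the dual.
value_at_unit = sum(fhat) / len(f)

# Isometry of the L^2 norms with respect to the two measures.
norm_sq_G = sum(abs(a) ** 2 for a in f)
norm_sq_dual = sum(abs(a) ** 2 for a in fhat) / len(f)
```

With these normalizations `inversion_error` is at the level of rounding error, `value_at_unit` agrees with `f[0]`, and the two squared norms coincide.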


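The Fourier inversion formula (C.381) is actually equivalent to its special case

$$ f(e) = \int_{\hat G} d\gamma\, \hat f(\gamma), \tag{C.382} $$

where $e \in G$ is the unit, since (C.381) follows by substituting $L_{x^{-1}} f$ for $f$ and using

$$ \widehat{L_{x^{-1}} f}(\gamma) = \gamma(x)\hat f(\gamma). \tag{C.383} $$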


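It is also important to realize that conceptually, the inversion formula (C.381) reads

$$ f(x) = \hat{\hat f}(\hat x^{-1}), \tag{C.384} $$

where the Fourier transform $\hat\xi$ for suitable $\xi : \hat G \to \mathbb{C}$ is defined, as in (C.356), by

$$ \hat\xi(\chi) = \int_{\hat G} d\gamma\, \overline{\chi(\gamma)}\,\xi(\gamma). \tag{C.385} $$

Here $\chi : \hat G \to \mathbb{T}$ is some character on $\hat G$, i.e., $\chi \in \hat{\hat G}$, and we have a natural map

$$ G \to \hat{\hat G}; \tag{C.386} $$
$$ x \mapsto \hat x; \tag{C.387} $$
$$ \hat x(\gamma) = \gamma(x). \tag{C.388} $$

Pontryagin duality states that (C.386) - (C.388) define an isomorphism, i.e.,

$$ G \cong \hat{\hat G}. \tag{C.389} $$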
We omit the lengthy proof of this beautiful isomorphism of topological groups
(cf. the examples in Proposition C.108), and turn to the proof of Theorem C.111.


Proof. First, we (re)construct a correctly normalized Haar measure on $\hat G$ by defining

$$ \int_{\hat G} : C_c(\hat G, \mathbb{R}) \to \mathbb{R}; \tag{C.390} $$
$$ \xi \mapsto \inf\{f(e) \mid f \in C_0^*(G),\ \hat f \geq \xi \ \text{(pointwise)}\}. \tag{C.391} $$

This map takes values in $\mathbb{R}$, since if $\hat f$ is real, as required by $\hat f \geq \xi$ in (C.391),
then, noting that the Gelfand (= Fourier) transform on $C_0^*(G)$ maps the involution
(C.337) into complex conjugation on $C_0(\hat G)$, so is $f(e)$, cf. (C.337). Furthermore, $\int_{\hat G}$
is linear, as well as positive: if $\xi \geq 0$ (i.e., pointwise), then also $\hat f \geq 0$ in $C_0(\hat G)$, so
that $f \geq 0$ in $C^*(G)$, because by Theorem C.109 the map $f \mapsto \hat f$ is an isomorphism,
which by Theorem C.52 preserves positivity. This gives $\langle \psi, \pi(f)\psi \rangle_{L^2(G)} \geq 0$ for
all $\psi \in L^2(G)$, which by a simple continuity argument (in a proof by contradiction,
using the inclusion $C_0^*(G) \subset C_0(G)$) enforces $f(e) \geq 0$, and hence $\inf\{f(e)\} \geq 0$.


By Theorem B.19, there is a measure $d\gamma$ on $\hat G$ defining the integral $\int_{\hat G}$, i.e.,

$$ \int_{\hat G} d\gamma\, \xi(\gamma) = \inf\{f(e) \mid f \in C_0^*(G),\ \hat f \geq \xi\}, \tag{C.392} $$

where initially $\xi$ is real-valued, upon which the integral is extended to $C_c(\hat G)$ by
complex linearity, as usual in (Lebesgue) integration. The point is that the measure
$d\gamma$ is translation invariant and hence is a Haar measure on $\hat G$: indeed, replacing $\xi$
by $L_{\gamma'}\xi$ amounts to replacing $f$ (as a function that satisfies $\hat f \geq \xi$) by $\gamma' f$. Invariance
then follows from $\gamma'(e) = 1$ for any character $\gamma' \in \hat G$, which obviously implies
$(\gamma' f)(e) = \gamma'(e) f(e) = f(e)$. The Banach spaces $L^2(G)$ and $L^2(\hat G)$ are then defined
with respect to $dx$ on $G$ (assumed given) and $d\gamma$ on $\hat G$ (as above), respectively.

Furthermore, the proof uses an approximate unit $(\delta_U)$ of $C^*(G)$ that lies in $C_c(G)$
and is indexed by shrinking neighbourhoods $U$ of $e \in G$. More precisely, take the
directed set of all symmetric neighbourhoods of $e$ (i.e., $U^{-1} = U$), ordered by reverse
inclusion $\supseteq$, take positive functions $h_U \in C_c(W)$ for some neighbourhood $W$
of $e$ satisfying $W^2 \subset U$, normalize $h_U$ such that $\int_G h_U * h_U^* = 1$, and define

$$ \delta_U = h_U * h_U^*; \tag{C.393} $$
$$ f_U = f * \delta_U \quad (f \in C^*(G)). \tag{C.394} $$

We will show that for each $f \in C^*(G)$, we have

$$ \lim_U \|f_U - f\|_{C^*} = 0. \tag{C.395} $$

To this end, we first show that $\|\delta_U\|_{C^*} \leq 1$, which follows from the estimate

$$ \|\pi(\delta_U)\psi\| = \left\| \int_G dx\, \delta_U(x) L_x\psi \right\| \leq \int_G dx\, \delta_U(x) \|L_x\psi\| = \|\psi\|. \tag{C.396} $$
Similar estimates give $f * \delta_U \to f$ for $f \in C_c(G)$, so that finally

$$ \|f * \delta_U - f\|_{C^*} = \|f * \delta_U - g * \delta_U + g * \delta_U - g + g - f\|_{C^*} \leq 2\|f - g\|_{C^*} + \|g * \delta_U - g\|_{C^*}. \tag{C.397} $$

Taking $g \in C_c(G)$, an $\varepsilon/3$ argument finishes the proof of (C.395). Moreover,

$$ f_U \in C_0^*(G) \quad (f \in C^*(G)). \tag{C.398} $$
To prove this, take $g, h \in C_c(G)$. Regarding $g$ and $h$ as elements of $L^2(G)$, note that

$$ g * h(x) = \langle g^*, L_{x^{-1}} h \rangle_{L^2(G)}, \tag{C.399} $$

so that Cauchy–Schwarz and unitarity of $L_{x^{-1}}$ give $\|g * h\|_\infty \leq \|g\|_2 \|h\|_2$. Applying
this with $g \rightsquigarrow \pi(f)g$ and $h \rightsquigarrow h$, where $f \in C^*(G)$, $g \in C_c(G)$, and $h \in C_c(G)$, yields

$$ \|f * g * h\|_\infty \leq \|\pi(f)g\|_2 \|h\|_2 \leq \|f\|_{C^*} \|g\|_2 \|h\|_2; \tag{C.400} $$
$$ \|f * g * h\|_2 = \|\pi(f)(g * h)\|_2 \leq \|f\|_{C^*} \|g * h\|_2. \tag{C.401} $$

Eq. (C.401) will be applied later, in the proof of (C.379). Eq. (C.400) shows that if
$f_n \to f$ in $C^*(G)$ for some net $(f_n)$ in $C_c(G)$, then $f_n * g * h \to f * g * h$ uniformly, so
that $f * g * h \in C_0(G)$ and $f_n * g * h \to f * g * h$ in $C_0(G)$. Also,

$$ \|\widehat{f * g * h}\|_\infty = \|\hat f \hat g \hat h\|_\infty \leq \|\hat f\|_\infty \|\hat g \hat h\|_\infty = \|f\|_{C^*} \|\hat g \hat h\|_\infty, \tag{C.402} $$

by isometry of the Gelfand transform, so that also $\widehat{f_n * g * h} \to \widehat{f * g * h}$ in $C_0(\hat G)$. If
$f_n, g, h \in C_c(G)$ then $f_n * g * h \in C_c(G) \subset C_0^*(G)$, and the above computations give
$f_n * g * h \to f * g * h$ in $C_0^*(G)$. This shows that $f * g * h \in C_0^*(G)$; taking $g = h_U$ and
$h = h_U^*$ yields (C.398).

We now turn to the Fourier inversion formula (C.381). Since the Gelfand transform
$C^*(G) \to C_0(\hat G)$ is an isomorphism, for any $\xi \in C_0(\hat G)$, we can find $f \in C^*(G)$
such that $\hat f = \xi$, and we can find a net $f_U = f * \delta_U$ in $C_0^*(G)$ such that

$$ \lim_U \|f_U - f\|_{C^*} = \lim_U \|\hat f_U - \hat f\|_{C_0(\hat G)} = \lim_U \|\hat f_U - \xi\|_\infty = 0. \tag{C.403} $$

If $\xi \in C_c(\hat G)$, we in addition have $\hat f_U \to \xi$ in $C^*(\hat G)$, or, equivalently,

$$ \lim_U \|\hat f_U - \xi\|_{C^*(\hat G)} = 0. \tag{C.404} $$

Eq. (C.403) and the fact that $\hat\delta_U$ is continuous, and hence uniformly continuous on
every compact $K \subset \hat G$ (which we take such that it contains the support of $\hat f = \xi$),
gives $\lim_U \|\hat\delta_U - 1_K\|_\infty^{(K)} = 0$, where $\|\eta\|_\infty^{(K)}$ is the supremum of $|\eta(\gamma)|$ over all $\gamma \in K$.
For $\hat f \in C_c(\hat G)$, with $\hat f_U = \hat\delta_U \hat f$, this gives $\hat f_U \to \hat f$ in $L^1(\hat G)$. As we trivially have
$\|\eta\|_{C^*(\hat G)} \leq \|\eta\|_{L^1(\hat G)}$ (and similarly, of course, on $G$ itself), we obtain (C.404), which
together with (C.403) also yields $\hat f_U \to \xi$ in $C_0^*(\hat G)$.
Since $\hat f_U \in C_c(\hat G)$, the infimum in (C.392) is saturated, and hence the Fourier
inversion formula (C.381) holds for $f_U$. Pontryagin duality then yields isometry,
i.e., $\|\hat f_U\|_0 = \|f_U\|_0$. Convergence of $\hat f_U$ in $C_0^*(\hat G)$ therefore yields convergence of
$f_U$ in $C_0^*(G)$, necessarily to $f$, since we already knew that $f_U \to f$ in $C^*(G)$, cf.
(C.395). This shows that $f \in C_0^*(G)$, so that (C.381) holds for $f$, implying

$$ \|\hat f\|_0 = \|f\|_0. \tag{C.405} $$

Thus the Fourier transform $\mathcal{F} : C^*(G) \to C_0(\hat G)$ from Theorem C.109 is given by
continuous extension of $\mathcal{F}(f) = \hat f$ as defined by (C.356), where $f \in C_c(G)$.

To prove (C.380), let $B(G)$ be the set of all $f \in C_0^*(G)$ for which $\hat f \in C_c(\hat G)$, and
let $B(G)^-$ be its closure in $C_0^*(G)$. Then $\mathcal{F}$ restricts to an isometric isomorphism
$B(G) \to C_c(\hat G)$, and hence also to an isometric isomorphism $B(G)^- \to C_0^*(\hat G)$; we
recall that (by definition) $C_0^*(\hat G)$ is the completion of $C_c(\hat G)$ in its norm $\|\cdot\|_0$.

Repeating this construction for $\hat G$ instead of $G$, and using Pontryagin duality
(C.389) with the ensuing isomorphisms $C^*(\hat{\hat G}) \cong C^*(G)$ etc., we also have a Fourier
transform $\hat{\mathcal{F}} : C^*(\hat G) \to C_0(G)$. Since the Fourier inversion formula (C.381) holds
on $C_c(\hat G)$, we see that $\hat{\mathcal{F}}$ maps $C_c(\hat G)$ isometrically to $B(G)$ and hence by continuity
maps $C_0^*(\hat G)$ to $B(G)^-$. At the same time, $\hat{\mathcal{F}}$ maps $B(\hat G)$ (defined, mutatis mutandis,
like $B(G)$) to $C_c(G)$, and hence maps $B(\hat G)^-$ to $C_0^*(G)$. Since $B(G)^- \subset C_0^*(G)$, this
implies $B(G)^- = C_0^*(G)$ and $B(\hat G)^- = C_0^*(\hat G)$. This proves (C.380).

Returning to (C.381), we know from the above analysis that (C.356) and (C.381)
hold if $f \in C_0^*(G)$ and $\hat f \in C_c(\hat G)$. If $f \in L^1(G)$, then, by Lebesgue integration theory,
$\mathcal{F}(f)$ remains given by (C.356). If also $\hat f \in L^1(\hat G)$, then $\hat f \in L^1(\hat G) \cap C_0(\hat G)$ and
hence $\hat f \in C_0^*(\hat G)$, cf. (C.378). By (C.380), there exists $\tilde f \in C_0^*(G)$ such that $\tilde f = f$
in $C^*(G)$, and hence for a.e. $x \in G$ (with respect to Haar measure), we have

$$ f(x) = \lim_U f * \delta_U(x) = \lim_U \tilde f * \delta_U(x) = \tilde f(x). \tag{C.406} $$

It follows that $f = \tilde f$ a.e., and so the inversion formula (C.382), and hence (C.381),
holds, provided (if necessary) $f$ is replaced by its representative $\tilde f$.

Finally, to prove (C.379), take $f = \psi$ in (C.356) in $C_c(G)$, so that we may compute

$$ \|\hat\psi\|_2^2 = \int_{\hat G} d\gamma\, |\hat\psi(\gamma)|^2 = \psi * \psi^*(e) = \int_G dx\, \psi(x)\overline{\psi(x)} = \int_G dx\, |\psi(x)|^2 = \|\psi\|_2^2. \tag{C.407} $$

We may therefore extend $\mathcal{F}$, initially given by $\mathcal{F}(f) = \hat f$, from $C_c(G)$ to its
completion $L^2(G)$ in $\|\cdot\|_2$. Second, we prove surjectivity similarly to the previous part:
Pick $\xi \in C_c(\hat G)$, and hence $f \in C^*(G)$ with $\hat f = \xi$. Then $f_U = f * \delta_U \in L^2(G)$,
as follows from (C.401). Then $\hat f_U \to \hat f$ in $L^2(\hat G)$, since analogously to the previous
proof, we find that $(\hat f_U)$ is a Cauchy net in $L^2(\hat G)$. By isometry of $\mathcal{F}$ (as just proved),
this implies that $(f_U)$ is a Cauchy net in $L^2(G)$. Let $f_U \to g$ in $L^2(G)$; continuity of
$\mathcal{F}$ gives $\mathcal{F}(g) = \xi$, making $\mathcal{F}$ surjective at least onto $C_c(\hat G)$. Since $L^2(\hat G)$ is the
completion of $C_c(\hat G)$ in the $L^2$-norm $\|\cdot\|_2$, the Fourier transform $\mathcal{F} : L^2(G) \to L^2(\hat G)$
is an isometric surjection, and hence is unitary.
We close this section with the SNAG-Theorem (named after Stone, whose Theorem
5.73 it generalizes, Naimark, Ambrose, and Godement, each of whom published
versions of it in 1944). This theorem uses projection-valued measures, which we
have avoided so far, but which are appropriate here as well as in our application of
the SNAG-Theorem to the Goldstone Theorem 10.28. Recall that the Riesz–Radon
representation theorems B.19 and B.24 establish a bijective correspondence between
states on $C_0(X)$ and probability measures on $X$. There is a similar correspondence
between representations of $C_0(X)$ and projection-valued measures on $X$. Cf. §B.4.


Definition C.112. Let $X$ be a set with $\sigma$-algebra $\Sigma \subset \mathcal{P}(X)$, and $H$ a Hilbert space.
A projection-valued measure for $(X, \Sigma, H)$ is a map $e : \Sigma \to \mathcal{P}(H)$ such that for
each unit vector $\psi \in H$, the map $e^{(\psi)} : \Sigma \to [0,1]$ defined by

$$ e^{(\psi)}(A) = \langle \psi, e(A)\psi \rangle, \tag{C.408} $$

is a probability measure. Equivalently, $e(\emptyset) = 0_H$, $e(X) = 1_H$, $e(A \cap B) = e(A)e(B)$,
and $e(\bigcup_n A_n) = \sum_n e(A_n)$ for pairwise disjoint $A_n$ in the strong topology on $B(H)$.
The simplest example must be $H = L^2(X, \Sigma, \mu)$ with $e(A) = 1_A$, cf. §B.6.
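
To make the equivalent characterization concrete, here is a minimal sketch of the example $e(A) = 1_A$ for a finite set $X$ with counting measure, so that $H = \mathbb{C}^X$ and every $e(A)$ is a diagonal matrix. Operators are stored by their diagonal entries only, which is an assumption that works because everything in this example commutes; the helper names are hypothetical.

```python
def e(A, X):
    """Projection e(A): multiplication by the indicator function 1_A,
    stored as the list of its diagonal entries."""
    return [1 if x in A else 0 for x in X]

def compose(p, q):
    """Product of two diagonal operators."""
    return [a * b for a, b in zip(p, q)]

def expectation(psi, p):
    """<psi, p psi> for a diagonal operator p."""
    return sum(abs(c) ** 2 * d for c, d in zip(psi, p))

def integrate(f, X):
    """Integral of f against e: sum over x of f(x) e({x}), which is the
    diagonal operator with entries f(x)."""
    total = [0] * len(X)
    for x in X:
        total = [t + f(x) * d for t, d in zip(total, e({x}, X))]
    return total

X = [0, 1, 2, 3, 4]
A, B, C = {0, 1, 2}, {2, 3}, {3, 4}
psi = [0.5, 0.5, 0.5, 0.5, 0.0]          # a unit vector in C^5

product_rule = compose(e(A, X), e(B, X)) == e(A & B, X)
additivity = [a + c for a, c in zip(e(A, X), e(C, X))] == e(A | C, X)
probability_of_A = expectation(psi, e(A, X))   # e^(psi)(A)
```

Here `probability_of_A` is the value of the probability measure $e^{(\psi)}$ on $A$, and `integrate` realizes the operator obtained by integrating a function against $e$ in this finite setting.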

As in (B.328), one can integrate any bounded measurable function $f : X \to \mathbb{C}$
“against” $e$, i.e., there is a unique operator $\int_X de\, f$ such that for any $\varepsilon > 0$ there is a
finite partition $X = \bigsqcup_{i=1}^n A_i$ of $X$ into $n$ Borel sets $A_i$, such that for any $x_i \in A_i$,

$$ \left\| \int_X de\, f - \sum_{i=1}^n f(x_i) e(A_i) \right\| < \varepsilon. \tag{C.409} $$


Analogously to the Riesz—Radon representation theorem, one may then prove: 


Theorem C.113. Let $X$ be a locally compact Hausdorff space. There is a bijective
correspondence between non-degenerate representations $\pi : C_0(X) \to B(H)$ and
projection-valued measures $e$ for $(X, \Sigma, H)$ (where $\Sigma$ is the Borel $\sigma$-algebra), viz.

$$ \pi(f) = \int_X de\, f; \tag{C.410} $$
$$ e(A) = \pi(1_A), \tag{C.411} $$

where $\pi(1_A)$ is defined by extending $\pi$ from $C_0(X)$ to the C*-algebra $\mathcal{B}(X)$ of
bounded Borel functions on $X$ (cf. Theorem B.102 and Proposition B.98).

We finally need the existence of a bijective correspondence between continuous 
unitary representations u of G and non-degenerate representations of C*(G) given 
by (C.506) in §C.18 below; see the comment below Definition C.119. Combined 
with Theorems C.109 and C.113, we then obtain the SNAG-Theorem: 


Theorem C.114. There is a bijective correspondence between continuous unitary
representations $u$ of a locally compact abelian group $G$ on some Hilbert space $H$
and projection-valued measures $e : B(\hat G) \to \mathcal{P}(H)$ on the dual group $\hat G$, such that

$$ u(x) = \int_{\hat G} de(\gamma)\, \gamma(x). \tag{C.412} $$
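
For a finite group the theorem reduces to simultaneous diagonalization, which can be checked directly. The sketch below takes $G = \mathbb{Z}_4$ acting on $H = \mathbb{C}^4$ by shifts, $(u(x)\psi)(y) = \psi(y - x)$, and builds rank-one projections $e(\{\gamma_k\})$ such that $u(x) = \sum_k \gamma_k(x)\, e(\{\gamma_k\})$. The labelling of the projections is an assumption made here so that the character itself, rather than its conjugate, appears in the sum; all names are hypothetical.

```python
import cmath

N = 4

def gamma(k, x):
    """Character gamma_k of Z_N evaluated at x."""
    return cmath.exp(2j * cmath.pi * k * x / N)

def shift(x):
    """u(x) as an N x N matrix: (u(x) psi)(y) = psi(y - x)."""
    return [[1.0 if y == (z + x) % N else 0.0 for z in range(N)]
            for y in range(N)]

def projection(k):
    """e({gamma_k}): projection onto the unit vector y -> conj(gamma_k(y)) / sqrt(N)."""
    return [[gamma(k, y).conjugate() * gamma(k, z) / N for z in range(N)]
            for y in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))

def spectral_sum(x):
    """sum over k of gamma_k(x) e({gamma_k}), the finite analogue of the integral."""
    ps = [projection(k) for k in range(N)]
    return [[sum(gamma(k, x) * ps[k][i][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

zero = [[0.0] * N for _ in range(N)]
snag_error = max(max_diff(shift(x), spectral_sum(x)) for x in range(N))
idempotent_error = max_diff(matmul(projection(1), projection(1)), projection(1))
orthogonality_error = max_diff(matmul(projection(1), projection(2)), zero)
```

All three error quantities are at the level of rounding error, so the projections form a projection-valued measure on the four-point dual group that reproduces the shift representation.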
C.16 Intermezzo: Lie groupoids


Groupoids generalize groups, group actions, and equivalence relations. As such, 
they provide a more flexible language for dealing with symmetries than either of 
these. Like Lie groups, one also has Lie groupoids, which form an important tool in 
constructing continuous bundles of C*-algebras (see §C.19 below). These, in turn, 
provide the mathematical foundation of (deformation) quantization, see Chapter 7. 


Definition C.115. A groupoid $G = (G_1, G_0, s, t, i, I)$ is a small category (i.e. a
category in which the underlying classes are sets, cf. §E.1) in which each arrow is
invertible. Thus one has a set (of arrows) $G_1$, doubly fibered over some base space
$G_0$ through source and target maps $s, t : G_1 \to G_0$. These maps define the set

$$ G_2 = \{(x,y) \in G_1 \times G_1 \mid s(x) = t(y)\} \tag{C.413} $$

of composable pairs, on which a multiplication $m : G_2 \to G_1$ is defined, which we
simply denote by $xy = m(x,y)$, subject to the axioms

$$ s(xy) = s(y); \quad t(xy) = t(x) \quad ((x,y) \in G_2); \tag{C.414} $$
$$ (xy)z = x(yz) \quad ((x,y) \in G_2,\ (y,z) \in G_2), \tag{C.415} $$

the third being well defined by virtue of the first and the second.
Furthermore, there is an object inclusion map $i : G_0 \hookrightarrow G_1$, $u \mapsto \mathrm{id}_u$, satisfying

$$ s(\mathrm{id}_u) = t(\mathrm{id}_u) = u \quad (u \in G_0); \tag{C.416} $$
$$ x\, \mathrm{id}_{s(x)} = \mathrm{id}_{t(x)}\, x = x \quad (x \in G_1). \tag{C.417} $$

Finally, what makes a (small) category a groupoid is the existence of an inverse

$$ I : G_1 \to G_1, \quad x \mapsto x^{-1}, $$

satisfying

$$ s(x^{-1}) = t(x); \quad t(x^{-1}) = s(x) \quad (x \in G_1); \tag{C.418} $$
$$ x^{-1}x = \mathrm{id}_{s(x)}; \quad xx^{-1} = \mathrm{id}_{t(x)} \quad (x \in G_1). \tag{C.419} $$

A Lie groupoid is a groupoid for which $G_1$ and $G_0$ are manifolds, $s$ and $t$ are
surjective submersions, and multiplication and inversion are smooth.


We often identify $u$ with $\mathrm{id}_u$, so that $x^{-1}x = s(x)$, etc. We allow manifolds with
boundary, which provide key examples; cf. Proposition C.117 below.


Proposition C.116. In a Lie groupoid, object inclusion is an immersion, inversion
is a diffeomorphism, $G_2$ is a closed submanifold of $G_1 \times G_1$, and for each $u \in G_0$,
the fibers $s^{-1}(u)$ and $t^{-1}(u)$ are submanifolds of $G_1$.


Abusing notation, $G_1$ is often called $G$. Some basic examples of Lie groupoids are:
• Lie groups $G$, where $G_1 = G$ and $G_0 = \{e\}$, $e \in G$ being the unit.

• Manifolds $M$, where $G_1 = G_0 = M$ with the obvious trivial groupoid structure
$s(x) = t(x) = \mathrm{id}_x = x^{-1} = x$, and $xx = x$.

• Pair groupoids over a manifold $G_0 = M$, where $G_1 = M \times M$ and $s(x,y) = y$,
$t(x,y) = x$, $(x,y)^{-1} = (y,x)$, $(x,y)(y,z) = (x,z)$, and $\mathrm{id}_x = (x,x)$.

• Smooth equivalence relations, i.e. immersed submanifolds of $M \times M$.

• Action groupoids $\Gamma \ltimes M$, which are defined by a smooth (left) action $\Gamma \circlearrowright M$ of
a Lie group $\Gamma$ on a manifold $M$, where $G_1 = \Gamma \times M$, $G_0 = M$, $s(g,m) = g^{-1}m$,
$t(g,m) = m$, $(g,m)^{-1} = (g^{-1}, g^{-1}m)$, and $(g,m)(h, g^{-1}m) = (gh, m)$.

• Vector bundles $\pi : E \to M$ over a manifold $G_0 = M$, with $s = t$ given by the
bundle projection $\pi$, object inclusion $M \hookrightarrow E$ as the zero section, multiplication
as fiberwise addition of tangent vectors, and inverse $\xi^{-1} = -\xi$.
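
The groupoid axioms of Definition C.115 can be checked mechanically on finite analogues of two of these examples. The following sketch is an illustration only: it replaces manifolds by finite sets (so smoothness plays no role), uses the pair groupoid on a three-point set and the action groupoid of $\mathbb{Z}_n$ acting on itself by translation, and the class and function names are hypothetical.

```python
from itertools import product

class PairGroupoid:
    """Pair groupoid on a finite set M: arrows (x, y) with s(x, y) = y, t(x, y) = x."""
    def __init__(self, M):
        self.objects = list(M)
        self.arrows = list(product(self.objects, repeat=2))
    def s(self, a): return a[1]
    def t(self, a): return a[0]
    def unit(self, u): return (u, u)
    def inv(self, a): return (a[1], a[0])
    def mul(self, a, b): return (a[0], b[1])        # defined when s(a) == t(b)

class ActionGroupoid:
    """Z_n acting on Z_n by translation g.m = (g + m) mod n; arrows (g, m)."""
    def __init__(self, n):
        self.n = n
        self.objects = list(range(n))
        self.arrows = list(product(range(n), repeat=2))
    def s(self, a): return (a[1] - a[0]) % self.n   # s(g, m) = g^{-1} m
    def t(self, a): return a[1]                     # t(g, m) = m
    def unit(self, u): return (0, u)
    def inv(self, a): return ((-a[0]) % self.n, (a[1] - a[0]) % self.n)
    def mul(self, a, b): return ((a[0] + b[0]) % self.n, a[1])  # (g,m)(h,g^{-1}m) = (gh,m)

def satisfies_axioms(G):
    """Check the source/target, associativity, unit and inverse axioms by brute force."""
    for u in G.objects:
        if G.s(G.unit(u)) != u or G.t(G.unit(u)) != u:
            return False
    for a in G.arrows:
        if G.mul(a, G.unit(G.s(a))) != a or G.mul(G.unit(G.t(a)), a) != a:
            return False
        if G.s(G.inv(a)) != G.t(a) or G.t(G.inv(a)) != G.s(a):
            return False
        if G.mul(G.inv(a), a) != G.unit(G.s(a)) or G.mul(a, G.inv(a)) != G.unit(G.t(a)):
            return False
    for a, b in product(G.arrows, repeat=2):
        if G.s(a) != G.t(b):
            continue                                 # (a, b) is not composable
        ab = G.mul(a, b)
        if G.s(ab) != G.s(b) or G.t(ab) != G.t(a):
            return False
        for c in G.arrows:
            if G.s(b) == G.t(c) and G.mul(ab, c) != G.mul(a, G.mul(b, c)):
                return False
    return True

pair_ok = satisfies_axioms(PairGroupoid("abc"))
action_ok = satisfies_axioms(ActionGroupoid(4))
```

Both checks succeed, which confirms that the source, target, inverse and multiplication formulas listed above are mutually consistent.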


Any Lie groupoid $G$ defines an associated tangent groupoid $G^T$, which will play
a crucial role in §C.19. We first explain the (surprising) underlying differential
geometry in three steps of increasing complexity. We start with the manifold $M = \mathbb{R}^n$,
with tangent bundle $TM = \mathbb{R}^{2n}$. Our goal is to describe a smooth structure on

$$ F = TM \sqcup (0,1] \times M \times M, \tag{C.420} $$

seen as a bundle over $[0,1]$, where (as the notation already indicates) the fibers are

$$ F_0 = TM; \tag{C.421} $$
$$ F_\hbar = M \times M \quad (\hbar > 0). \tag{C.422} $$

Although each fiber $F_\hbar$ of this bundle is isomorphic to $\mathbb{R}^{2n}$, its smooth structure is
not equal or even diffeomorphic to the usual one on $[0,1] \times \mathbb{R}^{2n}$. Instead, we define

$$ \varphi : [0,1] \times TM \to TM \sqcup (0,1] \times M \times M; \tag{C.423} $$
$$ \varphi(0, \xi) = \xi; \tag{C.424} $$
$$ \varphi(\hbar, \xi) = (\hbar, \exp^W(\hbar\xi)) \quad (\hbar > 0), \tag{C.425} $$

where the symmetrized (“Weyl”) exponential map $\exp^W : TM \to M \times M$ is given by

$$ \exp^W(x,v) = (x - \tfrac{1}{2}v,\ x + \tfrac{1}{2}v). \tag{C.426} $$

Here the coordinates $(x,v)$ of $\xi \in T_xM$ denote $\xi f(x) = \sum_i v^i \left(\frac{\partial f}{\partial x^i}\right)(x) \equiv \sum_i v^i \partial_i f(x)$.

Like its more familiar counterpart $(x,v) \mapsto (x, x+v)$, $\exp^W$ is a diffeomorphism.
For $M = \mathbb{R}^n$, our map $\varphi$ is a bijection, with inverse given by

$$ \varphi^{-1}(x,v) = (0,x,v); \tag{C.427} $$
$$ \varphi^{-1}(\hbar,x,y) = \left(\hbar, \frac{x+y}{2}, \frac{y-x}{\hbar}\right) \quad (\hbar > 0). \tag{C.428} $$


We use this to transfer the product topology (and also the smooth structure as a
manifold with boundary) from $[0,1] \times TM$ to $F$. Then a sequence $(\hbar_n, x_n, y_n)$ in $F$,
where $\hbar_n \to 0$, converges iff $x_n \to x$, $y_n \to x$ for some $x \in M$, and $(y_n - x_n)/\hbar_n \to v$,
in which case $(\hbar_n, x_n, y_n) \to (0,x,v)$. More abstractly, $F$ has two key properties:


1. The map $F \to [0,1] \times M \times M$ given by

$$ (x,v) \mapsto (0,x,x); \tag{C.429} $$
$$ (\hbar,x,y) \mapsto (\hbar,x,y) \quad (\hbar > 0), \tag{C.430} $$

is smooth. Indeed, as a map $[0,1] \times TM \to [0,1] \times M \times M$, this map is given by

$$ (0,x,v) \mapsto (0,x,x); \tag{C.431} $$
$$ (\hbar,x,v) \mapsto (\hbar,\ x - \tfrac{1}{2}\hbar v,\ x + \tfrac{1}{2}\hbar v). \tag{C.432} $$


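2. For any $f \in C^\infty(M \times M)$ that vanishes on the diagonal

$$ \Delta(M) = \{(x,x) \mid x \in M\} \subset M \times M, \tag{C.433} $$

the function $\delta f$ on $F$ defined by

$$ \delta f(x,v) = \tilde\xi f(x,x); \tag{C.434} $$
$$ \delta f(\hbar,x,y) = f(x,y)/\hbar \quad (\hbar > 0), \tag{C.435} $$

where the tangent vector $\tilde\xi \in T_{(x,x)}(M \times M)$ has components $(-\tfrac{1}{2}v, \tfrac{1}{2}v)$, is
smooth. Indeed, as a function on $[0,1] \times TM$, the pullback $\delta^* f = \delta f \circ \varphi$ is given
by

$$ \delta^* f(0,x,v) = \tilde\xi f(x,x); \tag{C.436} $$
$$ \delta^* f(\hbar,x,v) = f(x - \tfrac{1}{2}\hbar v,\ x + \tfrac{1}{2}\hbar v)/\hbar, \tag{C.437} $$

which is smooth given our assumptions on $f$.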
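
The two properties can be illustrated numerically for $M = \mathbb{R}$. The sketch below implements the map $\varphi$ and its inverse, and evaluates the pullback $\delta^* f$ for one sample function vanishing on the diagonal, $f(x,y) = (y-x)(1+xy)$, which is a hypothetical choice made only for this illustration. For this $f$ the derivative along $(-\tfrac12 v, \tfrac12 v)$ at $(x,x)$ equals $v(1+x^2)$, so the value at $\hbar = 0$ can be written in closed form.

```python
def phi(hbar, x, v):
    """The map [0,1] x TM -> F for M = R, using the Weyl exponential map."""
    if hbar == 0:
        return (0.0, x, v)
    return (hbar, x - 0.5 * hbar * v, x + 0.5 * hbar * v)

def phi_inv(hbar, a, b):
    """Inverse of phi; for hbar > 0 the arguments (a, b) are a point of M x M."""
    if hbar == 0:
        return (0.0, a, b)
    return (hbar, 0.5 * (a + b), (b - a) / hbar)

def f(x, y):
    """Sample smooth function on M x M that vanishes on the diagonal."""
    return (y - x) * (1.0 + x * y)

def delta_star_f(hbar, x, v):
    """Pullback of delta f to [0,1] x TM."""
    if hbar == 0:
        # Derivative of f at (x, x) along (-v/2, v/2); for this f it is v * (1 + x^2).
        return v * (1.0 + x * x)
    _, a, b = phi(hbar, x, v)
    return f(a, b) / hbar

round_trip = phi_inv(*phi(0.25, 1.0, 3.0))
limit_value = delta_star_f(0.0, 1.0, 3.0)
nearby_value = delta_star_f(1e-4, 1.0, 3.0)
```

As $\hbar \to 0$ the quotient $f/\hbar$ stays bounded and approaches the directional derivative, which is the content of the smoothness claim in this one-dimensional example.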


A similar construction works for any (smooth) manifold $M$, except that the
smooth structure on $F$ may no longer be definable in terms of a single map $\varphi$. Instead,
we invoke a special case of the well-known tubular neighbourhood theorem
of Riemannian (or, more generally, affine) geometry, which states that $M$, identified
with the zero section in its tangent bundle $TM$, has an open neighbourhood $U$ such
that the (symmetrized) exponential map $\exp^W : U \to M \times M$ is a diffeomorphism
onto its image. Here $\exp^W(\xi) = (\gamma(-\tfrac{1}{2}), \gamma(\tfrac{1}{2}))$, where $\xi \in T_xM$ and $\gamma$ is the unique
affinely parametrized geodesic with $\gamma(0) = x$ and $\dot\gamma(0) = \xi$. We now replace the
space $[0,1] \times TM$ used in the special case $M = \mathbb{R}^n$ by the pair of spaces

$$ V_1 = \{(\hbar, \xi) \in [0,1] \times TM \mid \hbar\xi \in U\}; \tag{C.438} $$
$$ V_2 = (0,1] \times M \times M, \tag{C.439} $$

with associated maps $\varphi_1 : V_1 \to F$ and $\varphi_2 : V_2 \to F$ defined by

$$ \varphi_1(0, \xi) = \xi; \tag{C.440} $$
$$ \varphi_1(\hbar, \xi) = (\hbar, \exp^W(\hbar\xi)) \quad (\hbar > 0); \tag{C.441} $$
$$ \varphi_2(\hbar, x, y) = (\hbar, x, y) \quad (\hbar > 0). \tag{C.442} $$

Then $\varphi_1$ and $\varphi_2$ are injective and, writing $F_i = \varphi_i(V_i)$, we have $F = F_1 \cup F_2$, which is
far from a disjoint union; let $F_{ij} = F_i \cap F_j$. Also, let $V_{ij} = \{a \in V_i \mid \varphi_i(a) \in F_j\}$, with
associated maps $\varphi_{ij} = \varphi_j^{-1} \circ \varphi_i : V_{ij} \to V_{ji}$. We now define the smooth structure of $F$
by declaring $f : F \to \mathbb{R}$ to be smooth iff $f_i \circ \varphi_i : V_i \to \mathbb{R}$ is smooth, $i = 1,2$, where $f_i$
is the restriction of $f$ to $F_i$. These conditions are compatible on the overlap $F_{ij}$, since
$\varphi_{12}(\hbar, \xi) = (\hbar, \exp^W(\hbar\xi))$ is a diffeomorphism (with inverse $\varphi_{21}$). This smooth structure
may also be defined by imposing conditions 1 and 2 above, mutatis mutandis. In
particular, (C.434) should now read

$$ \delta f(\xi) = \tilde\xi f(x,x); \quad \tilde\xi = (-\tfrac{1}{2}\xi, \tfrac{1}{2}\xi) \in T_{(x,x)}(M \times M) \cong T_xM \oplus T_xM. \tag{C.443} $$


A more general form of the above construction, which will be used to generate a
vast class of continuous bundles of C*-algebras, is as follows. Let $M$ be a closed
submanifold of another manifold $G$ (in the above situation we take $G = M \times M$ and
identify $M$ with $\Delta(M)$), and replace $TM$ above by the normal bundle

$$ N_MG = T_MG / T_MM, \tag{C.444} $$

i.e., the quotient of the restriction $T_MG$ of the tangent bundle $TG$ to $M \subset G$ by its
subbundle $T_MM = TM$; hence the fiber of $N_MG$ at $x \in M \subset G$ is $T_xG/T_xM$.
In the above case $G = M \times M$, one therefore has

$$ N_M(M \times M) \cong TM, \tag{C.445} $$

through the isomorphism $[(\xi_1, \xi_2)] \mapsto \tfrac{1}{2}(\xi_2 - \xi_1)$, where $(\xi_1, \xi_2) \in T_{(x,x)}(M \times M)$
and $[(\xi_1, \xi_2)]$ is its equivalence class in the quotient $T_{(x,x)}(M \times M)/T_{(x,x)}(\Delta(M))$.
Other easy examples are Lie groups $G$, for which $N_MG = T_eG = \mathfrak{g}$ is just the Lie
algebra of $G$ (at least as a vector space), and $G = M$, for which $N_MG = M$.
For the bundle $F$, defined over $I = [0,1]$, we take the fibers and total space as

$$ F_0 = N_MG; \tag{C.446} $$
$$ F_\hbar = G \quad (\hbar > 0); \tag{C.447} $$
$$ F = N_MG \sqcup (0,1] \times G. \tag{C.448} $$


Once again, there are two equivalent ways to define a smooth structure on $F$. The
first uses a more general version of the tubular neighbourhood theorem from differential
geometry, which states that $M \subset N_MG$ (seen as its zero section) has an open
neighbourhood $U$ that is diffeomorphic to some open neighbourhood $U'$ of $M \subset G$
via a diffeomorphism $\Phi$ that maps $M$ to itself (i.e., pointwise). Then put

$$ V_1 = \{(\hbar, \xi) \in [0,1] \times N_MG \mid \hbar\xi \in U\}; \tag{C.449} $$
$$ V_2 = (0,1] \times G, \tag{C.450} $$

again with associated maps $\varphi_1 : V_1 \to F$ and $\varphi_2 : V_2 \to F$, this time defined by

$$ \varphi_1(0, \xi) = \xi; \tag{C.451} $$
$$ \varphi_1(\hbar, \xi) = (\hbar, \Phi(\hbar\xi)) \quad (\hbar > 0); \tag{C.452} $$
$$ \varphi_2(\hbar, \eta) = (\hbar, \eta) \quad (\hbar > 0). \tag{C.453} $$

One then proceeds exactly as above. Equivalently, we impose that:


1. The map $F \to [0,1] \times G$, defined at $\hbar = 0$ by $\xi \mapsto (0,x)$, where $\xi \in N_MG$ lies in the
fiber over $x \in M \subset G$, and by $(\hbar, \eta) \mapsto (\hbar, \eta)$ for $\hbar > 0$ and $\eta \in G$, is smooth.
2. For each $f \in C^\infty(G)$ that vanishes on $M$, the function $\delta f$ on $F$ defined by

$$ \delta f(\xi) = \xi f; \tag{C.454} $$
$$ \delta f(\hbar, \eta) = f(\eta)/\hbar \quad (\hbar > 0), \tag{C.455} $$

is smooth (note that $\xi f$ is well defined despite the fact that $\xi \in T_MG/T_MM$ rather
than $\xi \in T_MG$, since any two representatives of $\xi$ in $T_MG$ differ by vectors in
$TM$, which vanish on $f$ because $f|_M = 0$ by assumption).


After this preparation, we are at last in a position to define tangent groupoids. 


Proposition C.117. Any Lie groupoid $G$ over some base space $G_0 = M$ defines an
associated tangent groupoid $G^T$, with total space $G^T = F$, cf. (C.448), with smooth
structure as explained, base space $G^T_0 = [0,1] \times M$, source and target projections

$$ s^T(\xi) = t^T(\xi) = (0, \pi(\xi)) \quad (\hbar = 0); \tag{C.456} $$
$$ s^T(\hbar, x) = (\hbar, s(x)) \quad (\hbar > 0); \tag{C.457} $$
$$ t^T(\hbar, x) = (\hbar, t(x)) \quad (\hbar > 0), \tag{C.458} $$

where $\pi : T_MG/T_MM \to M$ is the bundle projection, and $x \in G$, multiplication

$$ \xi \cdot \eta = \xi + \eta \quad (\hbar = 0); \tag{C.459} $$
$$ (\hbar, x) \cdot (\hbar, y) = (\hbar, xy) \quad (\hbar > 0), \tag{C.460} $$

and inverse

$$ \xi^{-1} = -\xi \quad (\hbar = 0); \tag{C.461} $$
$$ (\hbar, x)^{-1} = (\hbar, x^{-1}) \quad (\hbar > 0). \tag{C.462} $$

In other words, $G^T$, seen as a bundle over $[0,1]$, is a “bundle of groupoids”: the
groupoid above $\hbar = 0$ is the normal bundle $\pi : N_MG \to M$, as in the vector bundle
example above, whereas the fibers above $\hbar > 0$ are $G$ itself.
C.17 C*-algebras associated to Lie groupoids


One may associate two C*-algebras to a Lie groupoid $G$, called $C_r^*(G)$ and $C^*(G)$,
which coincide for abelian Lie groups, and as such generalize the construction in
§C.15, cf. (C.332) and (C.336) - (C.337). We first generalize the Haar measure.


Definition C.118. A Haar system on a Lie groupoid $G$ is a family of measures
$(\mu^u, u \in G_0)$, where $\mu^u$ is defined on the $t$-fiber

$$ G^u = t^{-1}(u), \tag{C.463} $$

where it is locally equivalent to Lebesgue measure, and each function

$$ u \mapsto \int_{G^u} d\mu^u\, f \quad (f \in C_c^\infty(G)) \tag{C.464} $$

on $G_0$ is smooth. A Haar system is left-invariant if for each $f \in C_c^\infty(G)$ and $x \in G$,

$$ \int_{G^{s(x)}} d\mu^{s(x)}(y)\, f(xy) = \int_{G^{t(x)}} d\mu^{t(x)}(y)\, f(y). \tag{C.465} $$

It is sometimes convenient to regard $\mu^u$ as a measure on all of $G$ but having support
in $G^u$. Either way, any Lie groupoid possesses a left-invariant Haar system, briefly
called a left Haar system. For example, if $G$ is a Lie group, $u \in G_0$ can only be the
identity $e \in G$, so that a left-invariant Haar system is the same as a left-invariant
Haar measure on $G$ (which exists on any locally compact group). Furthermore:


1. If $G = G_0 = M$, where $M$ is a manifold (as always), then $s(x) = t(x) = x = x^{-1}$,
and the condition (C.465) is empty, so that a left-invariant Haar system is just a
smooth function $\mu : M \to (0,\infty)$, i.e. $\mu(u) = \mu^u$. In what follows, we simply take
$\mu(u) = 1$ for each $u \in M$. More generally, whenever $G^u$ is compact, we normalize
a Haar system by imposing $\mu^u(G^u) = 1$, as in the case of groups.

2. For a pair groupoid $G = M \times M$, on the other hand, (C.465) forces the system of
measures to collapse to a single measure $\mu$ on $M$, i.e., $\mu^u = \mu$ for each $u \in M$.
For $M = \mathbb{R}^n$, we take $\mu$ to be Lebesgue measure.

3. For the tangent bundle $G = TM$ (with fiberwise addition), which is essentially a
bundle of abelian groups $\mathbb{R}^n$, eq. (C.465) forces each measure $\mu^u$ on

$$ t^{-1}(u) = T_uM, \tag{C.466} $$

to be translation invariant. For $M = \mathbb{R}^n$ (or, more generally, if $TM$ is a trivial
bundle), we take all $\mu^u$ to be the same and all equal to Lebesgue measure.

4. For action groupoids $\Gamma \ltimes M$, we have $t^{-1}(u) \cong \Gamma$, and any left-invariant Haar
measure $d\gamma$ on $\Gamma$ yields a left Haar system on $G$ as $d\mu^u = d\gamma$, for each $u \in M$.

5. In case of a tangent groupoid $G^T$, the $t$-fibers are indexed by $(\hbar, u)$, where
$\hbar \in [0,1]$ and $u \in M$, so that a left Haar system consists of a family $\mu^{(\hbar,u)}$. It turns
out that given any left Haar system $(\mu^u, u \in M)$ on $G$, there exists a (suitably
normalized) left Haar system $(\mu_0^u, u \in M)$ on the vector bundle $N_MG$ such that

$$ \mu^{(0,u)} = \mu_0^u \quad (\hbar = 0); \tag{C.467} $$
$$ \mu^{(\hbar,u)} = \hbar^{-n}\mu^u \quad (\hbar > 0), \tag{C.468} $$

where $n = \dim(G) - \dim(M)$, defines a Haar system on $G^T$; the extra factor $\hbar^{-n}$
in (C.468) is necessary and sufficient for this Haar system to satisfy the smoothness
condition on (C.464). For example, if $G = \mathbb{R}^n \times \mathbb{R}^n$ is the pair groupoid on
$\mathbb{R}^n$, where each fiber $G^y \cong \mathbb{R}^n$ is endowed with Lebesgue measure $d^nx$, then the
fibers $\mathbb{R}^n$ of the vector bundle $N_MG = T\mathbb{R}^n$ should carry exactly the same measure.
To see this, in (C.464) we substitute $G \rightsquigarrow G^T$ and $u \rightsquigarrow (\hbar, y)$ ($y \in \mathbb{R}^n$), so
that for each $f \in C_c^\infty(G^T)$ the following function on $[0,1] \times \mathbb{R}^n$ should be smooth:

$$ (0,y) \mapsto \int_{\mathbb{R}^n} d^nv\, f(0,y,v); \tag{C.469} $$
$$ (\hbar,y) \mapsto \hbar^{-n} \int_{\mathbb{R}^n} d^nx\, f(\hbar,x,y) \quad (\hbar > 0). \tag{C.470} $$

To interpret this condition, we put $f = \tilde f \circ \varphi^{-1}$, where $\tilde f$ is smooth on $[0,1] \times T\mathbb{R}^n$,
and $\varphi^{-1}$ is given by (C.427) - (C.428). This transforms the above function into

$$ (0,y) \mapsto \int_{\mathbb{R}^n} d^nv\, \tilde f(0,y,v); \tag{C.471} $$
$$ (\hbar,y) \mapsto \int_{\mathbb{R}^n} d^nv\, \tilde f(\hbar,\ y - \tfrac{1}{2}\hbar v,\ v) \quad (\hbar > 0). \tag{C.472} $$
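
The role of the factor $\hbar^{-n}$ in the last item can be seen in a small numerical experiment for $n = 1$. The sketch below is an illustration under assumptions: it uses the hypothetical test function $\tilde f(\hbar, x, v) = e^{-x^2 - v^2}$ on $[0,1] \times T\mathbb{R}$, a simple midpoint rule on a truncated interval, and the change of variables $v = (y-x)/\hbar$ that follows from the inverse map $(\hbar, x, y) \mapsto (\hbar, \tfrac{1}{2}(x+y), (y-x)/\hbar)$.

```python
import math

def f_tilde(hbar, x, v):
    """Hypothetical smooth test function on [0,1] x T R (here independent of hbar)."""
    return math.exp(-x * x - v * v)

def f_on_F(hbar, x, y):
    """f = f_tilde composed with the inverse of phi on the fiber over hbar > 0."""
    return f_tilde(hbar, 0.5 * (x + y), (y - x) / hbar)

def midpoint(func, lo, hi, steps):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def fibre_integral(hbar, y):
    """hbar^{-1} times the integral over x of f(hbar, x, y), for n = 1 and hbar > 0."""
    return midpoint(lambda x: f_on_F(hbar, x, y), y - 5.0, y + 5.0, 20000) / hbar

def transformed_integral(hbar, y):
    """Integral over v of f_tilde(hbar, y - hbar*v/2, v); also defined at hbar = 0."""
    return midpoint(lambda v: f_tilde(hbar, y - 0.5 * hbar * v, v), -10.0, 10.0, 20000)

y = 0.3
with_factor = fibre_integral(0.1, y)
after_substitution = transformed_integral(0.1, y)
value_at_zero = transformed_integral(0.0, y)     # equals sqrt(pi) * exp(-y^2)
```

The first two numbers agree, and both are close to `value_at_zero`, so the rescaled fiber integral extends continuously to $\hbar = 0$. Without the factor $\hbar^{-1}$ the integral over $x$ would instead tend to zero as $\hbar \to 0$ and could not match the value on the fiber $T\mathbb{R}$.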
We now define C*-algebras $C^*(G)$ and $C_r^*(G)$, which depend on the choice of
a left Haar system on $G$, but different choices lead to isomorphic C*-algebras. We
start from $C_c^\infty(G)$, on which we define a convolution product and an involution by

$$ f * g(x) = \int_{G^{s(x)}} d\mu^{s(x)}(y)\, f(xy)g(y^{-1}); \tag{C.473} $$
$$ f^*(x) = \overline{f(x^{-1})}. \tag{C.474} $$

We then define a C*-algebra $C^*(G)$ as the completion of $C_c^\infty(G)$ in the norm

$$ \|f\| = \sup\{\|\pi(f)\|\}, \tag{C.475} $$

where the supremum is over all Hilbert space representations of $C_c^\infty(G)$ that satisfy

$$ \|\pi(f)\| \leq \|f\|_I \equiv \max\{\|f\|_1, \|f^*\|_1\}, \tag{C.476} $$

where the canonical $L^1$-norm on the right-hand side is defined by

$$ \|f\|_1 = \sup_{u \in M} \int_{G^u} d\mu^u(y)\, |f(y)|; \quad \|f^*\|_1 = \sup_{u \in M} \int_{G^u} d\mu^u(y)\, |f(y^{-1})|. \tag{C.477} $$


A more tractable possibility is to limit these representations to a selected class, such
as the following one. Further to the $t$-fiber (C.463), we denote the $s$-fibers of $G$ by

$$ G_u = s^{-1}(u), \tag{C.478} $$

which carries a canonical measure

$$ d\mu_u(x) = d\mu^u(x^{-1}). \tag{C.479} $$

This leads to Hilbert spaces

$$ H_u = L^2(G_u, \mu_u), \tag{C.480} $$

on which $C_c^\infty(G)$ can be represented through the formula

$$ \pi_u(f)\psi(x) = \int_{G^u} d\mu^u(y)\, f(xy)\psi(y^{-1}) \quad (\psi \in H_u,\ x \in G_u,\ y \in G^u). \tag{C.481} $$

Such representations automatically satisfy the bound (C.476); restricting the
representations $\pi$ in (C.475) to these $\pi_u$, $u \in M$, gives the reduced groupoid C*-algebra
$C_r^*(G)$. In other words, $C_r^*(G)$ is the completion of $C_c^\infty(G)$ in the norm

$$ \|f\|_r = \sup\{\|\pi_u(f)\|, u \in M\}. \tag{C.482} $$

One often has $C_r^*(G) = C^*(G)$, but if $G$ is for example a non-compact and
semi-simple Lie group, then the two differ (in which case $C_r^*(G)$ is a quotient of $C^*(G)$).
Deferring groups to the next section, the other examples on our list are as follows.


1. For a space $G = M$, the algebraic operations are

$$ f * g(x) = f(x)g(x); \tag{C.483} $$
$$ f^*(x) = \overline{f(x)}, \tag{C.484} $$

from which we obtain

$$ C_r^*(M) \cong C_0(M). \tag{C.485} $$

Indeed, $G^x = \{x\}$, so with $\mu(x) = 1$ for each $x \in M$, we obtain

$$ H_x = \mathbb{C}; \tag{C.486} $$
$$ \pi_x(f) = f(x), \tag{C.487} $$

and hence $\|f\|_r = \|f\|_\infty$; the completion of $C_c^\infty(M)$ in this norm is $C_0(M)$.
2. A pair groupoid $G = M \times M$, with left Haar system $\mu^u = \mu$ for all $u \in M$, gives

$$ f * g(u,v) = \int_M d\mu(w)\, f(u,w)g(w,v); \tag{C.488} $$
$$ f^*(u,v) = \overline{f(v,u)}, \tag{C.489} $$

which of course is reminiscent of the corresponding operations on matrices. Also,

$$ H_u = L^2(M, \mu); \tag{C.490} $$
$$ \pi_u(f)\psi(v) = \int_M d\mu(w)\, f(v,w)\psi(w), \tag{C.491} $$

where we wrote $x = (v,u)$ and $y = (u,w)$, and identified $\psi(v)$ with $\psi(v,u)$. With
this identification, the representations $\pi_u$ are the same for each $u$. Using the fact
that $C_c^\infty(M \times M)$ is dense in $L^2(M \times M)$ and that integral operators (C.491) of
Hilbert–Schmidt type are dense in the compact operators, we obtain

$$ C_r^*(M \times M) \cong B_0(L^2(M)). \tag{C.492} $$


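3. For a tangent bundle $G = TM$, we have, identifying $T_uM$ with $\mathbb{R}^n$, $n = \dim(M)$,

$$ f * g(u,v) = \int_{\mathbb{R}^n} d^nw\, f(u, v+w)g(u, -w); \tag{C.493} $$
$$ f^*(u,v) = \overline{f(u,-v)}, \tag{C.494} $$

where we used local coordinates $(u,v)$ on $TM$. Furthermore, we have

$$ H_u = L^2(T_uM) \cong L^2(\mathbb{R}^n); \tag{C.495} $$
$$ \pi_u(f)\psi(v) = \int_{\mathbb{R}^n} d^nw\, f(u, v+w)\psi(-w), \tag{C.496} $$

which is diagonalized by a Fourier transform $f \mapsto \hat f$ (cf. Theorem C.109), with

$$ \hat f(u,p) = \int_{\mathbb{R}^n} d^nv\, f(u,v)\, e^{-i\langle p, v \rangle} \quad (p \in T_u^*M). \tag{C.497} $$

This map therefore gives an isomorphism

$$ C_r^*(TM) \cong C_0(T^*M). \tag{C.498} $$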


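4. The (reduced) C*-algebra of an action groupoid $G = \Gamma \ltimes M$ has operations

$$ f * g(\gamma,u) = \int_\Gamma d\delta\, f(\gamma\delta, u)\, g(\delta^{-1}, \delta^{-1}\gamma^{-1}u); \tag{C.499} $$
$$ f^*(\gamma,u) = \overline{f(\gamma^{-1}, \gamma^{-1}u)}, \tag{C.500} $$

and the special representations $\pi_u$ are given by

$$ H_u = L^2(\Gamma); \tag{C.501} $$
$$ \pi_u(f)\psi(\gamma) = \int_\Gamma d\delta\, f(\gamma\delta, \gamma u)\psi(\delta^{-1}). \tag{C.502} $$

This gives the (reduced) transformation group C*-algebra (see the end of §C.18)

$$ C_r^*(\Gamma \ltimes M) \cong C_r^*(\Gamma, M). \tag{C.503} $$


5. The C*-algebra $C^*(G^T)$ of a tangent groupoid will be analyzed in §C.19.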
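
The remark that the pair groupoid operations are reminiscent of matrix operations becomes literal when $M$ is a finite set with counting measure: convolution is then the matrix product and the involution is the conjugate transpose. The sketch below checks this on a two-point set; the finite set replaces the manifold, and the sample matrices `f`, `g` and vector `psi` are hypothetical.

```python
def convolve(f, g):
    """(f * g)(u, v) = sum over w of f(u, w) g(w, v), counting measure on M."""
    n = len(f)
    return [[sum(f[u][w] * g[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]

def star(f):
    """f^*(u, v) = conj(f(v, u))."""
    n = len(f)
    return [[complex(f[v][u]).conjugate() for v in range(n)] for u in range(n)]

def represent(f, psi):
    """(pi(f) psi)(v) = sum over w of f(v, w) psi(w)."""
    n = len(f)
    return [sum(f[v][w] * psi[w] for w in range(n)) for v in range(n)]

f = [[1, 2j], [0, 1 - 1j]]
g = [[3, 1], [1j, -2]]
psi = [1, 1j]

# pi is multiplicative: pi(f * g) psi = pi(f) pi(g) psi.
lhs = represent(convolve(f, g), psi)
rhs = represent(f, represent(g, psi))

# The involution reverses products: (f * g)^* = g^* * f^*.
star_of_product = star(convolve(f, g))
product_of_stars = convolve(star(g), star(f))
```

In this finite setting the completion is the full matrix algebra, which is the finite-dimensional counterpart of the compact operators on $L^2(M)$.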
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C.18 Group C*-algebras and crossed product algebras 


It can be shown that in cases 1-3 above we have $C^*(G) = C_r^*(G)$. It is useful to give
a more direct and general construction of both $C^*(G)$ and $C_r^*(G)$ in the case where
$G$ is a group or an action groupoid; although the former is a special case of the latter
by taking the trivial $G$-action on a point, we treat the group case separately first.
Let $G$ be a Lie group, or, more generally, a locally compact group, which for
simplicity we assume to be unimodular (so that it has a left Haar measure $dx$ that is
also right invariant). We turn $C_c^\infty(G)$, or, more generally, $C_c(G)$, into an algebra with
involution by specializing (C.473) - (C.474) to groups, i.e. (changing $y \mapsto x^{-1}y$),

$$f * g(x) = \int_G dy\, f(y)\, g(y^{-1}x); \tag{C.504}$$
$$f^*(x) = \overline{f(x^{-1})}. \tag{C.505}$$


Any unitary representation $u$ of $G$ on a Hilbert space $H$ (assumed strongly continuous,
as always) then gives rise to a representation $u^{\int}$ of this *-algebra by

$$u^{\int}(f) = \int_G dy\, f(y)\, u(y), \tag{C.506}$$

in that $u^{\int}(f * g) = u^{\int}(f)\, u^{\int}(g)$ and $u^{\int}(f^*) = u^{\int}(f)^*$. Let

$$\|f\| = \sup\{\|u^{\int}(f)\|\}, \tag{C.507}$$


where the supremum is over all continuous unitary representations of G. 
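
For a finite group, where the Haar measure is counting measure and every function lies in $C_c(G)$, the claims around (C.504) - (C.506) can be verified by brute force. The following Python sketch does this for the symmetric group $S_3$ and its left-regular representation; the choice of group, the sample functions and all helper names are illustrative assumptions, not part of the construction above.

```python
from itertools import permutations

# The symmetric group S_3 (finite, hence unimodular; Haar measure = counting measure).
G = list(permutations(range(3)))

def mul(a, b):
    """Composition of permutations: (ab)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    out = [0] * 3
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

def conv(f, g):
    """(C.504): f*g(x) = sum_y f(y) g(y^{-1} x)."""
    return {x: sum(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

def star(f):
    """(C.505): f^*(x) = complex conjugate of f(x^{-1})."""
    return {x: f[inv(x)].conjugate() for x in G}

def regular(f):
    """Matrix of (C.506) for the left-regular representation
    u_L(y) psi(x) = psi(y^{-1} x); its (x, z) entry works out to f(x z^{-1})."""
    return [[f[mul(x, inv(z))] for z in G] for x in G]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

# Sample functions with Gaussian-integer values, so that comparisons are exact.
f = {x: complex(i + 1, 2 * i - 3) for i, x in enumerate(G)}
g = {x: complex(3 - i, i * i) for i, x in enumerate(G)}

multiplicative = regular(conv(f, g)) == matmul(regular(f), regular(g))
star_preserving = regular(star(f)) == adjoint(regular(f))
```

Since $S_3$ is finite (hence compact and amenable), the reduced and full norms coincide here, so this single representation already determines the C*-norm in this toy case.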


Definition C.119. The group C*-algebra $C^*(G)$ of $G$ is the closure of $C_c^\infty(G)$ or
$C_c(G)$ in the norm (C.507). The reduced group C*-algebra $C_r^*(G)$ of $G$ is the
closure of $C_c^\infty(G)$ or $C_c(G)$ in the norm

$$\|f\|_r = \|u_L^{\int}(f)\|, \tag{C.508}$$

where $u_L$ is the left-regular representation $u_L(G)$ on $H = L^2(G)$, cf. (7.52).
The relationship between the two group C*-algebras is given by

$$C_r^*(G) = u_L^{\int}(C^*(G)) \cong C^*(G)/\ker(u_L^{\int}). \tag{C.509}$$

Definition C.120. A unitary representation $u_1$ is weakly contained in $u_2$ if
$\|u_1^{\int}(f)\| \leq \|u_2^{\int}(f)\|$ for all $f \in C_c(G)$. If every unitary representation of $G$ is weakly
contained in $u_L$, and hence $\ker(u_L^{\int}) = \{0\}$ and $C_r^*(G) = C^*(G)$, we call $G$ amenable.


It can be shown that $G$ is amenable iff the commutative C*-algebra $C_b(G)$ of
bounded continuous functions on $G$ with sup-norm has a left-invariant state $\varphi$, i.e.,

$$\varphi(L_y f) = \varphi(f) \quad (y \in G,\ f \in C_b(G)). \tag{C.510}$$

Here $L_y f(x) = f(y^{-1}x)$ as usual. This is the case, for example, for all compact
groups, all abelian groups, and all solvable groups (and semi-direct products thereof,
like the Euclidean group). Non-compact semi-simple Lie groups, like $SL_2(\mathbb{R})$ or the
Lorentz group, are not amenable; the same holds for e.g. the Poincaré group.

By construction, there is a bijective correspondence $u \leftrightarrow u^{\int}$ between unitary
representations of $G$ and non-degenerate representations of $C^*(G)$ (which restricts to a
bijection between unitary representations of $G$ that are weakly contained in $u_L$ and
non-degenerate representations of $C_r^*(G)$). In one direction, this is given by (C.506),
whilst in the other, one first decomposes $u^{\int} \equiv \rho$ as a direct sum of cyclic
representations with cyclic vectors $\Omega_i$, and then, for each $\Omega_i$ in the sum, puts

$$u(x)\rho(f)\Omega_i = \rho(L_x f)\Omega_i. \tag{C.511}$$


Now take any C*-algebra $A$ on which $G$ acts, in that there is a continuous group
homomorphism $\alpha : G \to \mathrm{Aut}(A)$, i.e., for each $x \in G$ we have an invertible
homomorphism $\alpha_x : A \to A$ such that $\alpha_x \circ \alpha_y = \alpha_{xy}$ and $\alpha_e = \mathrm{id}_A$ (or, equivalently,
$\alpha_x^{-1} = \alpha_{x^{-1}}$), and for each $a \in A$, the function $x \mapsto \alpha_x(a)$ from $G$ to $A$ is continuous.
We turn the space $C_c(G,A)$ into a *-algebra by generalizing (C.504) - (C.505) to

$$f * g(x) = \int_G dy\, f(y)\, \alpha_y(g(y^{-1}x)); \tag{C.512}$$
$$f^*(x) = \alpha_x(f(x^{-1})^*). \tag{C.513}$$


We construct representations of $C_c(G,A)$ as a *-algebra from pairs $(u(G), \pi(A))$,
where $u$ is a unitary representation of $G$, and $\pi$ is a representation of $A$ (both defined
on the same Hilbert space $H$) that satisfy the covariance condition

$$\pi(\alpha_x(a)) = u(x)\pi(a)u(x)^*. \tag{C.514}$$

Writing $\pi \rtimes u^{\int}$ for the associated representation of $C_c(G,A)$, we put

$$\pi \rtimes u^{\int}(f) = \int_G dx\, \pi(f(x))\, u(x), \tag{C.515}$$

and define

$$\|f\| = \sup\{\|\pi \rtimes u^{\int}(f)\|\}, \tag{C.516}$$


where the supremum runs over all pairs $(u(G), \pi(A))$ satisfying (C.514). The closure
$C^*(G,A,\alpha)$ of $C_c(G,A)$ in this norm is a C*-algebra called the crossed product
or covariance algebra defined by $G$, $A$, and $\alpha$. Once again, by construction
there is a bijective correspondence $(u,\pi) \leftrightarrow \pi \rtimes u^{\int}$ between pairs $(u,\pi)$ satisfying
(C.514) and non-degenerate representations $\pi \rtimes u^{\int} \equiv \rho$ of $C^*(G,A,\alpha)$, in one
direction given by (C.515), and in the other by

$$u(x)\rho(f)\Omega_i = \rho(\alpha_x(L_x f))\Omega_i; \tag{C.517}$$
$$\pi(a)\rho(f)\Omega_i = \rho(af)\Omega_i. \tag{C.518}$$

Here $\alpha_x(L_x f) \in C_c(G,A)$ is the function $y \mapsto \alpha_x(f(x^{-1}y))$; similarly, $af \in C_c(G,A)$
is given by $y \mapsto a f(y)$, and the cyclic vectors $\Omega_i$ are defined as in (C.511).

To construct a reduced crossed product, we take any injective representation
$\pi_0(A)$ on some Hilbert space $K$, and from it construct a new Hilbert space

$$H = L^2(G,K) \cong L^2(G) \otimes K, \tag{C.519}$$

consisting of all measurable functions $\psi : G \to K$ for which $\int_G dx\, \|\psi(x)\|_K^2 < \infty$, with

$$\langle \varphi, \psi \rangle = \int_G dx\, \langle \varphi(x), \psi(x) \rangle_K \tag{C.520}$$

as the inner product. This Hilbert space $H$ carries a covariant pair $(u(G), \pi(A))$, viz.

$$u(y)\psi(x) = \psi(y^{-1}x); \tag{C.521}$$
$$\pi(a)\psi(x) = \pi_0(\alpha_{x^{-1}}(a))\psi(x), \tag{C.522}$$
and hence an associated representation $\pi \rtimes u^{\int}$ of $C_c(G,A)$ given by (C.515), which
by continuity extends to a representation $\rho_r$ of $C^*(G,A,\alpha)$. As in the group case, we
define $C_r^*(G,A,\alpha)$ as the closure of $C_c(G,A)$ in the norm $\|f\|_r = \|\rho_r(f)\|$, or as

$$C_r^*(G,A,\alpha) = \rho_r(C^*(G,A,\alpha)). \tag{C.523}$$


If $G$ is amenable, we once again have $C_r^*(G,A,\alpha) = C^*(G,A,\alpha)$, as for $C_r^*(G)$.
The main case of interest to us is given by a group action $G \circlearrowright Q$, as above, which
gives rise to a crossed product $C^*(G, C_0(Q), \alpha) \equiv C^*(G,Q)$ through the choices

$$A = C_0(Q); \tag{C.524}$$
$$\alpha_x(f) = L_x f, \tag{C.525}$$

i.e., $\alpha_x(f)(q) = f(x^{-1}q)$. The (reduced) crossed product $C^*_{(r)}(G,Q)$, then, is the
same as the (reduced) C*-algebra of the action groupoid $G \ltimes Q$. Identifying the
spaces $C_c(G \times Q)$ and $C_c(G, C_c(Q))$, eqs. (C.512) - (C.513) now become

$$f * g(x,q) = \int_G dy\, f(y,q)\, g(y^{-1}x, y^{-1}q); \tag{C.526}$$
$$f^*(x,q) = \overline{f(x^{-1}, x^{-1}q)}. \tag{C.527}$$
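
That (C.526) - (C.527) indeed define an associative *-algebra can be tested numerically in the simplest finite case. The sketch below assumes $G = Q = \mathbb{Z}_3$ with $G$ acting on $Q$ by translation and counting measure as Haar measure; this choice, the sample functions and the helper names are ours and serve only as an illustration.

```python
N = 3
G = Q = range(N)     # G = Z_3 acting on Q = Z_3 by translation (illustrative choice)

def act(x, q):
    """The action x.q = q + x (mod N); the inverse of x acts as act(-x, q)."""
    return (q + x) % N

def conv(f, g):
    """(C.526): f*g(x,q) = sum_y f(y,q) g(y^{-1}x, y^{-1}q)."""
    return {(x, q): sum(f[(y, q)] * g[((x - y) % N, act(-y, q))] for y in G)
            for x in G for q in Q}

def star(f):
    """(C.527): f^*(x,q) = complex conjugate of f(x^{-1}, x^{-1}q)."""
    return {(x, q): f[((-x) % N, act(-x, q))].conjugate() for x in G for q in Q}

# Sample functions with Gaussian-integer values, so that comparisons are exact.
f = {(x, q): complex(x + 2 * q, x * q - 1) for x in G for q in Q}
g = {(x, q): complex(q - x, 1 + x) for x in G for q in Q}
h = {(x, q): complex(1 + x * q, q) for x in G for q in Q}

associative = conv(conv(f, g), h) == conv(f, conv(g, h))
anti_multiplicative = star(conv(f, g)) == conv(star(g), star(f))
involutive = star(star(f)) == f
```

The twist by the action in the second argument of `g` is what distinguishes this product from the plain group convolution (C.504) taken pointwise in $q$.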


The obvious candidate for a faithful representation of $C_0(Q)$ comes from a measure
$\nu$ on $Q$ with support $Q$, so that we may take $K = L^2(Q,\nu)$ and $\pi_0(f) = m_f$, i.e.,
$\pi_0(f)\psi = f\psi$, $f \in C_0(Q)$. Identifying $L^2(G) \otimes L^2(Q)$ with $L^2(G \times Q)$, this yields

$$u(y)\psi(x,q) = \psi(y^{-1}x, q); \tag{C.528}$$
$$\pi(f)\psi(x,q) = f(xq)\psi(x,q); \tag{C.529}$$
$$\rho_r(f)\psi(x,q) = \int_G dy\, f(y, xq)\, \psi(y^{-1}x, q). \tag{C.530}$$

C.19 Continuous bundles of C*-algebras


As shown in Chapter 7, continuous bundles of C*-algebras form a mathematical
bridge between the classical and the quantum worlds, but they also form a beautiful
structure in their own right. In what follows, $I$ is an arbitrary locally compact Hausdorff
space, but in the main text it is a subset of the unit interval $[0,1]$ that always
contains 0 as an accumulation point, so one may have e.g. $I = [0,1]$ itself, or

$$I = (1/\mathbb{N}) \cup \{0\} \equiv 1/\dot{\mathbb{N}}, \tag{C.531}$$

where $\mathbb{N} = \{1,2,\ldots\}$. In physics, $I$ plays the role of the value set for Planck's constant,
but also below we generically write $\hbar \in I$, if only to avoid notational confusion
with $x \in X$ (as $C_0(X)$ will be a typical fiber of the continuous bundles we study).


Definition C.121. Let $I$ be a locally compact Hausdorff space. A continuous bundle
of C*-algebras over $I$ consists of a C*-algebra $A$, a collection of C*-algebras
$(A_\hbar)_{\hbar \in I}$, and surjective homomorphisms $\varphi_\hbar : A \to A_\hbar$ for each $\hbar \in I$, such that:

1. The function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is in $C_0(I)$ for each $a \in A$.
2. Writing $\|\cdot\|_\hbar$ for the norm in $A_\hbar$, the norm of any $a \in A$ is given by

$$\|a\| = \sup_{\hbar \in I} \|\varphi_\hbar(a)\|_\hbar. \tag{C.532}$$

3. For any $f \in C_0(I)$ and $a \in A$, there is an element $fa \in A$ such that for each $\hbar \in I$,

$$\varphi_\hbar(fa) = f(\hbar)\varphi_\hbar(a). \tag{C.533}$$

A continuous (cross-) section of the bundle in question is a map $\hbar \mapsto a(\hbar) \in A_\hbar$,
$\hbar \in I$, for which there is an $a \in A$ such that $a(\hbar) = \varphi_\hbar(a)$ for each $\hbar \in I$.


Thus $A$ may be identified with the space of continuous sections of the bundle: if we
do so, the homomorphism $\varphi_\hbar$ is just the evaluation map at $\hbar$. The structure of $A$ as
a C*-algebra then corresponds to pointwise operations on sections. The idea is that
the family $(A_\hbar)_{\hbar \in I}$ of C*-algebras is glued together by specifying a topology on the
disjoint union $\bigsqcup_{\hbar \in I} A_\hbar$, seen as a fiber bundle over $I$. However, this topology is in
fact given rather indirectly, namely via the specification of the space of continuous
sections. This is reminiscent of Theorem C.23, which specifies the topology on a
locally compact Hausdorff space $X$ via the C*-algebra $C_0(X)$. More generally (the
previous case being the trivial vector bundle $E = X \times \mathbb{C}$), the Serre-Swan Theorem
about fiber bundles allows one to reconstruct the topology on a locally trivial vector
bundle $E \to X$ from the (finitely generated projective) $C_0(X)$-module $C_0(X,E)$ of
continuous sections of $E$. As in Definition C.121, one has maps $\varphi_x : C_0(X,E) \to E_x$
given by evaluation at $x$, so that (C.533) holds. However, continuous bundles of
C*-algebras need not be locally trivial; for us, this is even the whole point!
Another way of looking at continuous bundles of C*-algebras starts from a non-degenerate
homomorphism $\varphi$ from $C_0(I)$ to the center $Z(M(A))$ of the multiplier
algebra $M(A)$ of $A$ (see §C.10); we simply write $fa$ for $\varphi(f)a$, and similarly $C_0(I)A$.
In this notation, nondegeneracy means that $C_0(I)A$ is dense in $A$. Given such a non-degenerate
homomorphism $\varphi : C_0(I) \to Z(M(A))$, one may define fiber algebras by

$$A_\hbar = A/(C_0(I;\hbar) \cdot A); \tag{C.534}$$
$$C_0(I;\hbar) = \{f \in C_0(I) \mid f(\hbar) = 0\}; \tag{C.535}$$

since $C_0(I;\hbar) \cdot A$ is an ideal in $A$, the quotient $A_\hbar$ is a C*-algebra. The projections
$\varphi_\hbar : A \to A_\hbar$ are then given by the corresponding quotient maps sending $a \in A$ to
its equivalence class in $A_\hbar$. In general, the function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is merely upper
semicontinuous, so that one only obtains a structure equivalent to the one described
in Definition C.121 if one explicitly requires the above function to be in $C_0(I)$, in
which case clause 2 of Definition C.121 follows, too.

It is easy to find “trivial” examples of continuous bundles of C*-algebras: fix
some C*-algebra $B$ and take $A = C_0(I,B)$ with pointwise operations. In that case,
$A_\hbar = B$ for each $\hbar \in I$, and the map $\varphi_\hbar : A \to B$ is given by $\varphi_\hbar(a) = a(\hbar)$.

It is not so easy to find nontrivial examples, even with isomorphic fibers (these
were first given by Dixmier and Douady, who took the fiber algebras to be the
compact operators $B_0(H)$). To connect classical to quantum, we need bundles over
$I \subset [0,1]$ as described above, with non-isomorphic fibers, of which the fiber $A_0$
above $\hbar = 0$ is isomorphic to $C_0(X)$ for some (locally compact) phase space $X$,
and hence is commutative, whereas all other fibers are noncommutative. One might
say that it is the job of (deformation) quantization theory to construct such fields.
Without proof, we now describe the main class of examples relevant to physics.

As we have seen, each Lie groupoid $G$ canonically defines an associated C*-algebra
$C_r^*(G)$, in which $C_c^\infty$ functions on $G$ endowed with a generalized convolution
product (C.473) and involution (C.474) form a dense subspace. In particular,

$$C_r^*(TM) \cong C_0(T^*M); \tag{C.536}$$
$$C^*(M \times M) \cong B_0(L^2(M)), \tag{C.537}$$

where $M$ is a manifold (without boundary) with tangent bundle $TM$ and cotangent
bundle $T^*M$. More generally, for any given Lie groupoid $G$ one may define

$$A_0 = C_r^*(N_M G) \quad (\hbar = 0); \tag{C.538}$$
$$A_\hbar = C_r^*(G) \quad (\hbar > 0), \tag{C.539}$$

where $N_M G$ is the normal bundle to the embedding $M \hookrightarrow G$, cf. (C.444). Now consider
the tangent groupoid $G^T$, which is a bundle over $[0,1]$ with fibers

$$G^T_0 = N_M G \quad (\hbar = 0); \tag{C.540}$$
$$G^T_\hbar = G \quad (\hbar > 0). \tag{C.541}$$


The interplay between the differential geometry of the tangent groupoid and the
notion of (reduced) Lie groupoid C*-algebras is described by the following lemma.

Lemma C.122. The map $C_c^\infty(G^T) \to C_c^\infty(G^T_\hbar)$ that restricts $f$ to $G^T_\hbar \subset G^T$ continuously
extends to a surjective homomorphism $\varphi_\hbar : C_r^*(G^T) \to C_r^*(G^T_\hbar)$, $\hbar \in [0,1]$.

Various special cases and this lemma ultimately led to the key result of the 1990s:

Theorem C.123. For any Lie groupoid $G$, the fibers (C.538) - (C.539) merge into
a continuous bundle of C*-algebras over $I = [0,1]$ with total algebra $A = C_r^*(G^T)$
and homomorphisms $\varphi_\hbar : A \to A_\hbar$ as described in Lemma C.122.
The same result holds for the full groupoid C*-algebras $C^*(G^T_\hbar)$ and $C^*(G^T)$.
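For the pair groupoid $G = \mathbb{R}^n \times \mathbb{R}^n$, as in the argument (C.469) - (C.472) we take
some $f \in C_c^\infty(T\mathbb{R}^n)$, seen as a function $f \in C^\infty([0,1] \times T\mathbb{R}^n)$ that is independent of
$\hbar$. This yields a function $f \circ \varphi^{-1} \in C_c^\infty(G^T)$, and by construction,

$$f \circ \varphi^{-1}(0,x,v) = f(x,v); \tag{C.542}$$
$$f \circ \varphi^{-1}(\hbar,x,y) = f\left(\tfrac{1}{2}(x+y), \frac{x-y}{\hbar}\right) \quad (\hbar > 0). \tag{C.543}$$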


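By Lemma C.122, the function $\varphi_0(f \circ \varphi^{-1})$ is an element of

$$A_0 = C_r^*(T\mathbb{R}^n); \tag{C.544}$$

this element is just the function $f$. For $\hbar > 0$, we see $\varphi_\hbar(f \circ \varphi^{-1})$ as an element of

$$A_\hbar \cong B_0(L^2(\mathbb{R}^n)), \tag{C.545}$$

through (C.490) - (C.491). Calling this element $Q_\hbar(f)$, we have

$$Q_\hbar(f)\psi(x) = \hbar^{-n} \int_{\mathbb{R}^n} d^n y\, f\left(\tfrac{1}{2}(x+y), \frac{x-y}{\hbar}\right) \psi(y). \tag{C.546}$$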


We now use the isomorphism (C.536), implemented through the Fourier transform

$$\hat{f}(x,p) = \int_{\mathbb{R}^n} d^n v\, f(x,v)\, e^{-ipv}; \tag{C.547}$$
$$f(x,v) = \int_{\mathbb{R}^n} \frac{d^n p}{(2\pi)^n}\, \hat{f}(x,p)\, e^{ipv}. \tag{C.548}$$


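Hence as an element of $C_0(T^*\mathbb{R}^n)$, the operator $\varphi_0(f \circ \varphi^{-1})$ is $\hat{f}$. From this perspective,
using (C.548), eq. (C.546) may be rewritten in the more familiar form

$$Q_\hbar(f)\psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^n p\, d^n y}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar}\, \hat{f}\left(\tfrac{1}{2}(x+y), p\right) \psi(y). \tag{C.549}$$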


It follows that any $f \in C_c^\infty(T\mathbb{R}^n)$ defines a continuous cross-section of the continuous
bundle of C*-algebras defined by $A = C_r^*((\mathbb{R}^n \times \mathbb{R}^n)^T)$, given by (C.547), and

$$0 \mapsto \hat{f} \in C_0(T^*\mathbb{R}^n); \tag{C.550}$$
$$\hbar \mapsto Q_\hbar(f) \in B_0(L^2(\mathbb{R}^n)). \tag{C.551}$$

See also §7.1. These formulae were written down for the special case $M = \mathbb{R}^n$, but
similar results (based on the exponential map as defined in Riemannian geometry)
apply to any manifold. Moreover, as explained in §§7.2-7.4, Mackey's theory of
quantization based on systems of imprimitivity and induced group representations
falls squarely under the above umbrella, where $G$ is an action groupoid.

We also employ continuous bundles of C*-algebras with non-isomorphic fibers
even away from $\hbar = 0$. The construction of these fields relies on the following result,
which is a special case of a more general claim; we just state the case we need, in
which $I = 1/\dot{\mathbb{N}}$; continuity then imposes conditions at $\hbar = 0$ only (as $I$ is discrete
elsewhere). We identify the total space $A$ of a (continuous) bundle of C*-algebras
with the space of its (continuous) sections, as explained at the beginning of this
section; thus $a \in A \subset \prod_{\hbar \in I} A_\hbar$ takes the form $a = (a_\hbar)_{\hbar \in I}$, $a_\hbar \in A_\hbar$.


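Proposition C.124. Suppose one has a family $(A_\hbar)_{\hbar \in I}$ of C*-algebras over $I = 1/\dot{\mathbb{N}}$,
as well as a subset $\tilde{A} \subset \prod_{\hbar \in I} A_\hbar$ that satisfies the following conditions:

1. The set $\{\tilde{a}_\hbar \mid \tilde{a} \in \tilde{A}\}$ is dense in $A_\hbar$ for each $\hbar \in I$.
2. One has $\lim_{\hbar \to 0} \|\tilde{a}_\hbar\| = \|\tilde{a}_0\|$ for each $\tilde{a} \in \tilde{A}$.
3. The set $\tilde{A}$ is a *-algebra (under pointwise operations).

Let $A$ consist of all $a \in \prod_{\hbar \in I} A_\hbar$ for which one has

$$\lim_{N \to \infty} \|a_{1/N} - \tilde{a}_{1/N}\| = \|a_0 - \tilde{a}_0\| \quad (\tilde{a} \in \tilde{A}). \tag{C.552}$$

Regard $A$ as a C*-algebra under pointwise operations and norm (C.532), and define

$$\varphi_\hbar(a) = a_\hbar. \tag{C.553}$$

Then $(A, (A_\hbar, \varphi_\hbar)_{\hbar \in I})$ is a continuous bundle of C*-algebras (and is the unique such
bundle whose space of sections contains $\tilde{A}$).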


The proof relies on the following lemma (which we state for general compact $I$).

Lemma C.125. The total C*-algebra $A$ of (sections of) a continuous bundle of C*-algebras
is locally uniformly closed. That is, if $a \in \prod_{\hbar \in I} A_\hbar$ is such that for every
$\hbar_0 \in I$ and every $\varepsilon > 0$, there exist $b^{\hbar_0} \in A$ and a neighborhood $\mathcal{N}$ of $\hbar_0$ in which
$\|a_\hbar - b^{\hbar_0}_\hbar\| < \varepsilon$ for all $\hbar \in \mathcal{N}$, then $a \in A$.

Equivalently, if $A$ (etc.) is a continuous bundle of C*-algebras, and $a \in \prod_{\hbar \in I} A_\hbar$ is
such that the function $\hbar \mapsto \|a_\hbar - b_\hbar\|$ lies in $C(I)$ for each $b \in A$, then $a \in A$.


Proof. Since $I$ is compact, it has a finite cover $\{U_1, \ldots, U_n\}$ with associated partition
of unity $\{u_i\}$. With $a$ and $\varepsilon$ as in the lemma, take $\hbar_i \in U_i$ and $b^{\hbar_i}$ also as in the lemma,
and define $b = \sum_i u_i b^{\hbar_i}$. Then $b$ satisfies $\sup_{\hbar \in I} \|a_\hbar - b_\hbar\| < \varepsilon$, and also $b \in A$, because
of Definition C.121.3. Hence $a \in A$ by Definition C.121.2 and completeness of $A$.
As to the equivalent version, given $a \in \prod_{\hbar \in I} A_\hbar$ and $\hbar_0 \in I$, because $\varphi_{\hbar_0}$ is surjective,
there is a $b^{\hbar_0} \in A$ such that $a_{\hbar_0} = b^{\hbar_0}_{\hbar_0}$. The assumption in the second part then implies
that the conditions in the first part are satisfied, so that $a \in A$.


We are now in a position to prove Proposition C.124.

Proof. We first show that $A$ as defined in the proposition is locally uniformly closed.
With the notation of Lemma C.125 and its proof, take $\tilde{a} \in \tilde{A}$, and define the functions

$$f_{a\tilde{a}} : \hbar \mapsto \|a_\hbar - \tilde{a}_\hbar\|; \tag{C.554}$$
$$f_{b\tilde{a}} : \hbar \mapsto \|b^{\hbar_0}_\hbar - \tilde{a}_\hbar\|. \tag{C.555}$$

Since $|\,\|X\| - \|Y\|\,| \leq \|X - Y\|$, one obtains

$$|f_{a\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar)| < \varepsilon, \tag{C.556}$$

for all $\hbar \in \mathcal{N}$. By assumption, $f_{b\tilde{a}}$ is continuous, so that

$$|f_{b\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar_0)| < \varepsilon, \tag{C.557}$$

for all $\hbar$ in some neighborhood $U'$ of $\hbar_0$. Combining the two inequalities yields

$$|f_{a\tilde{a}}(\hbar) - f_{a\tilde{a}}(\hbar_0)| < 3\varepsilon, \tag{C.558}$$

for all $\hbar \in U' \cap \mathcal{N}$. Hence $f_{a\tilde{a}}$ is continuous at any $\hbar_0 \in I$, so that $a \in A$ by Lemma C.125.

Using this property, it is easily shown that A is a C*-algebra, and that condition 3 
in Definition C.121 is satisfied. It is clear from Definition C.121.1 and the definition 
of A in the proposition that A is maximal. On the other hand, according to the second 
part of Lemma C.125, A is minimal, so that it is unique. 


To close, let us explain to what extent we can say that a given section $(a_0, a_{1/N})_N$ of
either one of our continuous bundles $A^{(c)}$ or $A^{(q)}$ “converges” to its value $a_0$.


Proposition C.126. Let $(a_0, a_{1/N})$ and $(a_0', a_{1/N}')$ be continuous cross-sections of
some continuous bundle $A$ of C*-algebras over $I = 1/\dot{\mathbb{N}}$, such that

$$\lim_{N \to \infty} \|a_{1/N} - a_{1/N}'\| = 0. \tag{C.559}$$

Then $a_0' = a_0$. In particular, if $(a_0, a_{1/N})$ is a continuous cross-section, then $a_0$ is
uniquely determined by the $(a_{1/N})$ and we may symbolically write

$$a_0 = \lim_{N \to \infty} a_{1/N}. \tag{C.560}$$

Proof. The last part of Lemma C.125 states that the function defined by

$$0 \mapsto \|a_0 - a_0'\|;$$
$$1/N \mapsto \|a_{1/N} - a_{1/N}'\|,$$

is continuous on $1/\dot{\mathbb{N}}$ (i.e., continuous at 0). By (C.559), its value at 0 must therefore
vanish, so that $a_0' = a_0$.

C.20 von Neumann algebras and the $\sigma$-weak topology


In this section and in §C.24 we turn to special classes of C*-algebras that are occa- 
sionally used in quantum (field) theory. Since the arguments tend to become very 
lengthy and technical, we will only prove some key results (e.g. von Neumann’s 
Double Commutant Theorem), and mention other results without proof (references 
to which may be found in the Notes). This also applies to the next four sections. 
The subject of operator algebras historically started with what we now call von 
Neumann algebras, in honour of the founder of the subject (although, curiously, 
C*-algebras are not called “Gelfand—Naimark algebras”; perhaps they should!). 
The first result in operator algebras was and is the Double Commutant Theorem: 


Theorem C.127. Let $M$ be a unital *-subalgebra of $B(H)$. Then the following conditions
are equivalent (and, if satisfied, define $M$ to be a von Neumann algebra):

(i) $M'' = M$;
(ii) $M$ is closed in the weak operator topology;
(iii) $M$ is closed in the strong operator topology.

Recall that the commutant $S'$ of any $S \subset B(H)$ is defined by

$$S' = \{a \in B(H) \mid ab = ba\ \ \forall b \in S\}, \tag{C.561}$$

and that the bicommutant of $S$ is $S'' = (S')'$. If $S^* = S$, in that $a \in S$ iff $a^* \in S$, then
$S'$ is easily seen to be a unital *-algebra within $B(H)$. Furthermore, it is obvious that
$S \subseteq S''$, so that the passage $S \mapsto S''$ is some sort of a closure operation within $B(H)$,
comparable to the closure operation $L \mapsto L^{\perp\perp}$ within $H$ itself. Theorem C.127 shows
that if $S$ is a unital *-algebra, the algebraic closure operation $S \mapsto S''$ coincides with
two topological closure operations. To this effect, recall also that:

• The weak operator topology on $B(H)$ may be defined by saying that $a_\lambda \to a$
(where $(a_\lambda)$ is some net in $M$) iff $\langle \varphi, (a_\lambda - a)\psi \rangle \to 0$ for all $\varphi, \psi \in H$;

• The strong operator topology on $B(H)$ yields convergence $a_\lambda \to a$ of some net
$(a_\lambda)$ iff $\|(a_\lambda - a)\psi\| \to 0$ for each $\psi \in H$.


Proof. The essence of the proof is already contained in the finite-dimensional case
$H = \mathbb{C}^n$, where the nontrivial claim in Theorem C.127 is:

If $M$ is a unital *-subalgebra of $M_n(\mathbb{C})$, then $M'' = M$.

In fact, all we need to prove is $M'' \subseteq M$, since the converse inclusion is obvious.
The idea is to take $n$ arbitrary (and hence possibly linearly independent) vectors
$v_1, \ldots, v_n$ in $H$, and, given $a \in M''$, find some $b \in M$ such that $av_i = bv_i$ for all
$i = 1, \ldots, n$. Hence $a = b$, so $a \in M$. To this end, we start with a single vector $v \in H$.

Form the linear subspace $Mv = \{mv \mid m \in M\}$ of $H$, with associated projection $e$
(i.e. $ew = w$ if $w \in Mv$ and $ew = 0$ if $w \in (Mv)^\perp$). Then $e \in M'$, and hence $a \in M''$
commutes with $e$. Since $1_H \in M$, we have $v \in Mv$, so $v = ev$, and we compute
$av = aev = eav \in Mv$. Hence $av = bv$, for some $b \in M$.

Now run the same argument with the following substitutions:
• $H \leadsto H^n = H \oplus \cdots \oplus H$ (with $n$ terms).
• $M \leadsto M_n = \{\mathrm{diag}(m, \ldots, m) \mid m \in M\}$.
• $v \leadsto \mathbf{v} = \oplus_i v_i = (v_1, \ldots, v_n)$.

We then have $(M_n)'' = (M'')_n$, so for any matrix $\mathbf{a} = \mathrm{diag}(a, \ldots, a)$ in $(M'')_n$, the
previous argument yields a matrix $\mathbf{b} = \mathrm{diag}(b, \ldots, b) \in M_n$ such that $\mathbf{a}\mathbf{v} = \mathbf{b}\mathbf{v}$. But
this is $av_i = bv_i$ for all $i = 1, \ldots, n$, so that $a = b$ and hence $M'' \subseteq M$.

If $H$ is infinite-dimensional, the above proof may be adapted by taking the closure
of $Mv$ in $H$, which gives (iii) $\Rightarrow$ (i). Finally, (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) is trivial.
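
The finite-dimensional statement $M'' = M$ can also be checked by direct computation in a small case. The Python sketch below is only an illustration under our own assumptions: it takes the unital *-algebra spanned by $1$ and $a = \mathrm{diag}(1,1,2)$ in $M_3(\mathbb{C})$, computes commutants by solving the linear equations $XS = SX$ exactly over the rationals, and compares dimensions. The helper names `nullspace` and `commutant` are ours.

```python
from fractions import Fraction

def nullspace(rows, ncols):
    """Basis of {x : rows . x = 0}, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                fac = rows[i][c]
                rows[i] = [x - fac * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            vec[pc] = -rows[i][free]
        basis.append(vec)
    return basis

def commutant(mats, n):
    """Basis of S' = {X : XS = SX for all S in mats} inside the n x n matrices."""
    rows = []
    for s in mats:
        for i in range(n):
            for j in range(n):
                row = [0] * (n * n)
                for k in range(n):
                    row[i * n + k] += s[k][j]   # coefficient of X[i][k] in (XS)[i][j]
                    row[k * n + j] -= s[i][k]   # coefficient of X[k][j] in (SX)[i][j]
                rows.append(row)
    return [[v[i * n:(i + 1) * n] for i in range(n)] for v in nullspace(rows, n * n)]

n = 3
one = [[int(i == j) for j in range(n)] for i in range(n)]
a = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # self-adjoint, with a*a = 3a - 2
M = [one, a]                             # spans the *-algebra {diag(s, s, t)}

M1 = commutant(M, n)                     # basis of M'
M2 = commutant(M1, n)                    # basis of M''
```

Here `M1` spans the block-diagonal algebra $M_2(\mathbb{C}) \oplus \mathbb{C}$, whereas `M2` has the same dimension as $M$ and consists of matrices of the form $\mathrm{diag}(s,s,t)$, in line with the theorem.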


Corollary C.128. Let $M$ be a unital *-subalgebra of $B(H)$. Then the closures of $M$
in the strong and weak topologies coincide with each other and with $M''$.


Corollary C.129. A von Neumann algebra is norm-closed, i.e., is a C*-algebra. 


Since $S''' = S'$, the commutant of any self-adjoint set $S^* = S \subset B(H)$ is a von
Neumann algebra. As a case in point, take a (strongly continuous) unitary group
representation $u : G \to B(H)$. Then $u(x)^* = u(x^{-1})$, so $u(G)'$ is a von Neumann
algebra. In fact, any von Neumann algebra $M$ takes this form, since one may take $G$
to be the group of all unitaries in $M'$ (and $u$ its defining representation). Furthermore,
the bicommutant $A''$ of any C*-algebra $A \subset B(H)$ is a von Neumann algebra. An
important example of this construction is the abelian von Neumann algebra $W^*(a) =
C^*(a)''$ generated by a self-adjoint operator $a = a^* \in B(H)$, cf. (B.320).

Although the weak and strong topologies on $M$ appear in the fundamental double
commutant theorem, the most important topology on a von Neumann algebra
(besides the norm topology) is the so-called $\sigma$-weak topology (sometimes called
the ultraweak topology). This topology corresponds to the following convergence:

• One has $a_\lambda \to a$ $\sigma$-weakly iff $\mathrm{Tr}(b(a_\lambda - a)) \to 0$ for each $b \in B_1(H)$.


To begin with, as far as Theorem C.127 is concerned this topology is at least on a 
par with the weak and the strong ones: 


Theorem C.130. Let $M$ be a unital *-subalgebra of $B(H)$. Then $M'' = M$ (i.e. $M$ is
a von Neumann algebra) iff $M$ is closed in the $\sigma$-weak operator topology.

This one is a bit more technical, so we just sketch the proof.

Proof. Define a new Hilbert space $H^\infty = H \otimes \ell^2$, whose elements $\mathbf{v}$ are infinite
sequences of vectors $(v_1, v_2, \ldots)$ in $H$ with $\sum_i \|v_i\|^2 < \infty$. The inner product is

$$\langle \mathbf{v}, \mathbf{v}' \rangle_{H^\infty} = \sum_i \langle v_i, v_i' \rangle. \tag{C.562}$$

The obvious (diagonal) embedding of $B(H)$ in $B(H^\infty)$, whose image is denoted by
$B(H)_\infty$, restricts to $M \subset B(H)$, with image $M_\infty \subset B(H^\infty)$. Then the $\sigma$-weak topology
on $B(H)$ is the relative weak topology on $B(H)_\infty$ (i.e., the weak topology on $B(H^\infty)$
restricted to $B(H)_\infty$), so that Theorem C.130 follows from Theorem C.127.


This brings us to an important refinement of Theorem C.127, called Kaplansky's
Density Theorem (which should actually be seen as a lemma for numerous results):

Theorem C.131. Let $A \subset B(H)$ be a C*-algebra (or a *-algebra). Then the unit ball
of $A$ is dense in the unit ball of $A''$ in the weak, strong, and $\sigma$-weak topologies.

The real significance of the $\sigma$-weak topology comes from Sakai's Theorem:

Theorem C.132. A C*-algebra $M \subset B(H)$ is a von Neumann algebra iff $M$ is the
(Banach) dual of a unique Banach space $M_*$ (called the predual of $M$).

We turn to the proof below. For example, by Theorem B.146, the predual of $B(H)$ is

$$B(H)_* = B_1(H). \tag{C.563}$$

In the commutative case, entry 10 in Table B.1 in §B.9 gives

$$L^\infty(X,\mu)_* = L^1(X,\mu); \tag{C.564}$$

the fact that $L^\infty(X,\mu)$, acting on $H = L^2(X,\mu)$ as multiplication operators, is a von
Neumann algebra was established in §B.16. In the first example, the $\sigma$-weak topology
on $B(H)$ obviously coincides with the weak*-topology defined by $B(H)_*$.

In general, there is a canonical embedding $M_* \hookrightarrow M^*$, $\varphi \mapsto \hat{\varphi}$, with $\hat{\varphi}(a) = a(\varphi)$,
cf. §B.9. Proposition B.46 then shows that the image of $M_*$ in $M^*$ consists precisely
of the weak*-continuous functionals on $M$ (recall that the weak*-topology on $M$ is
the topology of pointwise convergence, seeing $M$ as the dual of $M_*$). If we now identify
$\varphi$ with $\hat{\varphi}$, we have the following generalization of the observation just made:

Theorem C.133. Let $M \subset B(H)$ be a von Neumann algebra. The predual $M_*$ of
$M$ (seen as a subspace of $M^*$) coincides with the space of $\sigma$-weakly continuous
functionals on $M$, and hence the $\sigma$-weak topology on $M$ coincides with the weak*-topology
in its role as the dual Banach space of $M_*$.

$\sigma$-weakly continuous functionals on a von Neumann algebra $M$ are called normal.

Proof. Identifying $\varphi$ with $\hat{\varphi}$, we introduce the following spaces:

$$M^\perp = \{\varphi \in B(H)_* \mid \varphi(a) = 0\ \ \forall a \in M\};$$
$$M^{\perp\perp} = \{a \in B(H) \mid \varphi(a) = 0\ \ \forall \varphi \in M^\perp\}.$$


Having proved the theorem for $M = B(H)$, i.e., (C.563), the key is to show that

$$M^{\perp\perp} = M; \tag{C.565}$$
$$M_* \cong B(H)_*/M^\perp, \tag{C.566}$$

where (C.566) denotes an isometric isomorphism of normed spaces. Since the right-hand
side of (C.566) is a Banach space, so is the left-hand side. This yields the first
claim. Combining (C.566) with (C.565) and the duality $B(H) \cong B_1(H)^*$, we have

$$M_*^* \cong (B(H)_*/M^\perp)^* = M^{\perp\perp} = M.$$

This is the second claim. The first equality sign is true, because if $Y$ is a closed
subspace of a Banach space $X$, then $(X/Y)^* = \{\varphi \in X^* \mid \varphi{\restriction}Y = 0\}$.
For the remainder of the theorem, recall that $a_\lambda \to a$ $\sigma$-weakly in $M$ whenever
$\varphi(a_\lambda - a) \to 0$ for all $\varphi \in B(H)_*$. By (C.566), this is equivalent to $a_\lambda \to a$ in the
weak*-topology, since a possible component of $\varphi$ in $M^\perp$ drops out.

We next prove (C.565). The inclusion $M \subseteq M^{\perp\perp}$ is trivial. For the converse, pick
$a \notin M$; since $M$ is a von Neumann algebra, it is $\sigma$-weakly closed, so its complement
$M^c$ in $B(H)$ is $\sigma$-weakly open. Hence there are $\varphi \in B(H)_*$ and $\varepsilon > 0$ such that the
open neighbourhood $\mathcal{O}(a) = \{b \in B(H) : |\varphi(a) - \varphi(b)| < \varepsilon\}$ of $a$ entirely lies in
$M^c$. So $|\varphi(a) - \varphi(b)| \geq \varepsilon$ for all $b \in M$. This implies $\varphi(b) = 0$ by linearity in $b$,
i.e., $\varphi \in M^\perp$. Hence $|\varphi(a)| \geq \varepsilon$, so $a \notin M^{\perp\perp}$, hence $M^{\perp\perp} \subseteq M$.

For (C.566), first note that $M^\perp$ is a norm-closed subspace of $B(H)_* = B_1(H)$,
which is a Banach space in the trace-norm (which coincides with the norm inherited
from $B(H)^*$, since the injection $B_1(H) \hookrightarrow B(H)^*$ is an isometry). Hence the quotient
$B(H)_*/M^\perp$ is a Banach space in the canonical norm $\|\dot{\varphi}\| = \inf\{\|\varphi + \psi\| \mid \psi \in M^\perp\}$,
where $\dot{\varphi}$ is the image of $\varphi \in B(H)_*$ under the canonical projection, and the norm is
the one in $B(H)^*$. Let $\varphi' = \varphi{\restriction}M$ be the restriction of $\varphi \in B(H)_*$ to $M$. It is clear
that the map $\varphi' \mapsto \dot{\varphi}$ is well defined and is a linear bijection from $M_*$ to $B(H)_*/M^\perp$.
In fact, this map is isometric. First, one trivially has

$$\|\varphi'\| = \sup\{|\varphi(a)| \mid a \in M_1\} = \inf_{\psi \in M^\perp} \sup\{|\varphi(a) + \psi(a)| \mid a \in M_1\}, \tag{C.567}$$

since $\psi(a) = 0$. But this is clearly majorized by

$$\|\dot{\varphi}\| = \inf_{\psi \in M^\perp} \sup\{|\varphi(a) + \psi(a)| \mid a \in B(H)_1\}, \tag{C.568}$$

since now the supremum is taken over a larger set. Hence $\|\varphi'\| \leq \|\dot{\varphi}\|$.

Conversely, for any $\varphi \in B(H)_*$ with $\|\dot{\varphi}\| = 1$, by Corollary B.41 there exists
an $a \in B(H)$ with $a \in M^{\perp\perp}$, $\varphi(a) = 1$ and $\|a\| = 1$. From (C.565), one then has
$\|\varphi'\| \geq |\varphi(a)| = 1 = \|\dot{\varphi}\|$. This finishes the proof of Theorem C.133.


Half of Theorem C.132 evidently follows from Theorem C.133. The converse
(‘if’) implication uses a refinement of the GNS-construction, where the state $\omega$ is assumed
to be $\sigma$-weakly continuous. In that case, using the theory of $\sigma$-weakly closed
ideals of von Neumann algebras, it can be shown that $\pi_\omega(M)$ coincides with $\pi_\omega(M)''$
and hence is a von Neumann algebra. Since normal pure states on a von Neumann
algebra may not exist (for example, take $M = L^\infty(0,1)$), the ‘crazy’ Hilbert space $H_c$
in the proof of Theorem C.87 must be replaced by the perhaps even crazier direct
sum $H_{cc} = \bigoplus_{\omega \in S_n(M)} H_\omega$, where this time the sum is over all normal states on $M$.
Similarly, in Lemma C.15 one should now have a normal state instead of a pure
state. Otherwise, the proof that $M$ has a faithful representation as a von Neumann
algebra on a Hilbert space essentially follows the proof of Theorem C.87.

Finally, uniqueness of the predual follows from Corollary C.139 below. 


Corollary C.134. Let $M \subset B(H)$ be a von Neumann algebra. Each normal functional
$\varphi \in M_*$ on $M$ is of the form $\varphi(a) = \mathrm{Tr}(ba)$, for some $b \in B_1(H)$. In particular,
$\varphi$ is a normal state iff $b$ is a density operator.

C.21 Projections in von Neumann algebras


General C*-algebras need not have any nontrivial projections; think of $C([0,1])$.
On the other hand, von Neumann algebras are generated by their projections:

Theorem C.135. Let $\mathcal{P}(M) = \{e \in M \mid e^2 = e^* = e\}$, where $M$ is a von Neumann
algebra. Then $M$ is the norm-closure of the linear span of $\mathcal{P}(M)$, and $M = \mathcal{P}(M)''$.

This is Corollary B.105. In addition, $\mathcal{P}(M)$ is not just a set.

Proposition C.136. The set $\mathcal{P}(M)$ of projections in a von Neumann algebra $M$ is a
complete lattice under the partial ordering $e \leq f$ iff $ef = fe = e$.

Proof. Since $e \leq f$ in $M \subset B(H)$ iff $eH \subseteq fH$, the supremum $e \vee f$ is the projection
on the closure of $eH + fH$, whilst the infimum $e \wedge f$ is the projection on $eH \cap fH$. For arbitrary
families $(e_\lambda)_{\lambda \in \Lambda}$ of projections, $\vee_\lambda e_\lambda$ equals the projection on the closure of the
linear span of all subspaces $H_\lambda = e_\lambda H$, whereas $\wedge_\lambda e_\lambda \equiv e$ is the projection on their
intersection. To show that the latter lies in $M$ (provided all the $e_\lambda$ do, of course), note
that each unitary $u \in M'$ satisfies $uH_\lambda = H_\lambda$ for all $\lambda$, so that also $u(\cap_\lambda H_\lambda) = \cap_\lambda H_\lambda$.
Hence $eu = ue$ and so $e \in M'' = M$ (since each element of a von Neumann algebra
is a linear combination of at most four unitaries in it; the proof is similar to Lemma
B.145). Finally, by de Morgan's Law we have $\vee_\lambda e_\lambda = (\wedge_\lambda e_\lambda^\perp)^\perp$, with $f^\perp = 1 - f$
for any $f \in \mathcal{P}(M)$. Hence also $\vee_\lambda e_\lambda \in M$.
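
For two non-commuting projections the lattice operations are not given by simple products, which a small numerical experiment makes visible. The sketch below uses the classical fact that the powers $(efe)^k$ converge strongly to $e \wedge f$ (von Neumann's alternating projections), together with de Morgan's law from the proof above; the $2 \times 2$ example, the fixed iteration count and the helper names are illustrative assumptions, and convergence can be much slower for other pairs of projections.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def complement(e):
    """e^perp = 1 - e."""
    n = len(e)
    return [[(1.0 if i == j else 0.0) - e[i][j] for j in range(n)] for i in range(n)]

def meet(e, f, iterations=60):
    """Approximate the infimum of e and f by a high power of efe."""
    x = matmul(matmul(e, f), e)
    out = x
    for _ in range(iterations - 1):
        out = matmul(out, x)
    return out

def join(e, f, iterations=60):
    """de Morgan: sup(e, f) = (inf(e^perp, f^perp))^perp."""
    return complement(meet(complement(e), complement(f), iterations))

e = [[1.0, 0.0], [0.0, 0.0]]    # projection on the first coordinate axis
f = [[0.5, 0.5], [0.5, 0.5]]    # projection on the diagonal line

inf_ef = meet(e, f)             # the two lines intersect only in 0
sup_ef = join(e, f)             # together they span the whole plane
product = matmul(e, f)          # not a projection, since e and f do not commute
```

In this example `inf_ef` is numerically zero and `sup_ef` is numerically the identity, whereas `product` is neither symmetric nor idempotent.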


This is nice in itself, but it also implies a very important result about maps between
von Neumann algebras. Recall that a (purely algebraic) isomorphism between C*-algebras
(seen as *-algebras) is automatically isometric and hence norm-continuous;
see Theorem C.62. An even better result holds for von Neumann algebras:

Theorem C.137. A (purely algebraic) isomorphism $\varphi : M \to N$ between von Neumann
algebras (seen as *-algebras) is an isomorphism of Banach spaces as well as
a homeomorphism with respect to the $\sigma$-weak topologies on $M$ and $N$.

This theorem only seems to have rather difficult proofs. One, relying on Proposition
C.136, is based on the following result. First, we say that a map $\varphi : M \to N$ of von
Neumann algebras is completely additive if for any family $(e_\lambda)$ in $\mathcal{P}(M)$,

$$\varphi(\vee_\lambda e_\lambda) = \vee_\lambda \varphi(e_\lambda). \tag{C.569}$$

Lemma C.138. Let $\varphi : M \to N$ be a homomorphism of von Neumann algebras.

1. $\varphi$ is $\sigma$-weakly continuous iff it is completely additive.
2. If $\varphi$ is a (purely algebraic) isomorphism, then it is completely additive.

The proof of claim 2 is easy, as is the implication from $\sigma$-weak continuity to complete
additivity in claim 1. The converse implication, however, is quite difficult.
In any case, Theorem C.137 now follows, so that we may speak of isomorphisms
between von Neumann algebras without any ambiguity.

Corollary C.139. If two von Neumann algebras are algebraically isomorphic, then
their preduals $M_*$ and $N_*$ are isomorphic as Banach spaces. In particular (take $M = N$),
the predual of a von Neumann algebra is unique (up to isometric isomorphism).


A second proof of Theorem C.137 uses Theorem C.132 (and hence provides no 
non-circular proof of Corollary C.139), as follows. 


Proof. Since $\varphi$ is isometric by Corollary C.129 and Theorem C.62, it induces a
dual isomorphism (of Banach spaces) $\varphi^* : N^* \to M^*$, with the property that
$M \cong (\varphi^*(N_*))^*$ under the map

\[ a \mapsto (\varphi^*(\omega) \mapsto \omega(\varphi(a))) \quad (a \in M, \omega \in N_*). \tag{C.570} \]

Uniqueness of the predual then yields $\varphi^*(N_*) = M_*$, which in turn implies that
$\varphi$ preserves pointwise convergent nets: if $\omega'(a_\lambda) \to \omega'(a)$ for all
$\omega' \in M_*$, then $\omega(\varphi(a_\lambda)) \to \omega(\varphi(a))$ for all $\omega \in N_*$.
Hence $\varphi$ is σ-weakly continuous.


Theorem C.137 shows that the notion of isomorphism to be used in the classifi- 
cation of von Neumann algebras M is unambiguous. There are two totally different 
cases of von Neumann algebras (only M = C falls in both classes): 


• Abelian von Neumann algebras, which equal their center ($M \cap M' = M$);
• Factors, which have trivial center ($M \cap M' = \mathbb{C} \cdot 1$).


A factor has no nontrivial decomposition $M = M_1 \oplus M_2$, whereas an abelian von
Neumann algebra (except M = C) does have such a decomposition (typically even 
many of them). Using von Neumann’s technique of direct integrals, which gener- 
alizes direct sums (and will not be reviewed here), the classification of general von 
Neumann algebras may be reduced to these two cases. We start with the first class. 

We know that if $(X, \Sigma, \mu)$ is some σ-finite Borel space with associated Hilbert
space $L^2(X, \mu)$, then the commutative C*-algebra $L^\infty(X, \mu)$ is mapped isometrically
into $B(L^2(X, \mu))$ via $f \mapsto m_f$, see Proposition B.73 and especially (B.240). If we
denote the image of this map by $L^\infty(X, \mu)$ also, then $L^\infty(X, \mu)'' = L^\infty(X, \mu)$
by (B.346), so $L^\infty(X, \mu) \subset B(L^2(X, \mu))$ is an abelian von Neumann algebra. In general:


Theorem C.140. Let $M \subset B(H)$ be an abelian von Neumann algebra. Then

\[ M \cong L^\infty(X, \mu), \tag{C.571} \]

for some locally compact space $X$ and probability measure $\mu$ on $X$.


If $H$ is separable, this follows from Theorems B.116 (including the remarks after its
proof) and B.117 in §B.16. The proof for arbitrary Hilbert space is quite technical
and will be omitted, but the idea is to find an abelian C*-algebra $A$ for which
$M = A''$, upon which $X = \Sigma(A)$, and the measure is constructed such that $\mu(\Delta) = 0$
iff $\mu_\psi(\Delta) = 0$ for all unit vectors $\psi \in H$, with $\mu_\psi$ defined similarly to (B.304).
In general, one cannot take $A = M$, since $\Sigma(M)$ may not support such measures.
Thus we have a complete and satisfactory characterization of abelian von Neumann
algebras, including their projections: these are simply the (equivalence classes of)
characteristic functions $1_\Delta$, where $\Delta \subset X$ is a Borel set in $X$ (modulo null sets).

The advantage of this approach is that there are often simple models for $X$; we
know from the classification of maximal abelian von Neumann algebras on separable
Hilbert space in §B.17 that $X = [0,1]$ (with Lebesgue measure) and $X = \mathbb{N}$ (with
counting measure) are enough in that case. However, the pair $(X, \mu)$ lacks intrinsic
uniqueness properties. Thus it also makes sense to apply Theorem C.8 to abelian
von Neumann algebras, so that $M \cong C(X)$. Since by Theorem C.135, $M$ has plenty
of projections, which as elements of $C(X)$ are realized by characteristic functions
$1_\Delta$, where $\Delta \subset X$, the space $X$ must have lots of clopen (i.e. closed and open) sets.

It can be shown that X arises as the Gelfand spectrum of some abelian von Neu- 
mann algebra iff it is hyperstonean, where we say that a compact Hausdorff X is: 


• Stone if the only connected subsets are points (equivalently, a Stone space is
compact, $T_0$, and has a basis of clopen sets).

• stonean if it is Stone and the closure of each open set is open.

• hyperstonean if it is stonean, and for any nonzero $f \in C(X, \mathbb{R}^+)$ there exists a
completely additive positive measure $\mu$ such that $\mu(f) > 0$.


This replaces the classification of abelian von Neumann algebras up to isomorphism 
by the classification of hyperstonean spaces up to homeomorphism, which is hardly 
an improvement (the only other area of mathematics where such wacky spaces ap- 
pear is algebraic logic). However, we do obtain a nice relationship between the 
projection lattice of an abelian von Neumann algebra and its Gelfand spectrum (at 
this point please recall Theorem D.5 and surrounding text in Appendix D). 


Theorem C.141. The projection lattice $\mathcal{P}(M)$ of a von Neumann algebra $M$ is
Boolean iff $M$ is abelian, in which case there is a homeomorphism

\[ \Sigma(M) \cong \mathcal{S}(\mathcal{P}(M)) \tag{C.572} \]

between the Gelfand spectrum of $M$ (as a commutative C*-algebra) and the Stone
spectrum of $\mathcal{P}(M)$ (as a Boolean lattice). Hence we have isomorphisms

\[ M \cong C(\mathcal{S}(\mathcal{P}(M))); \tag{C.573} \]
\[ \mathcal{O}(\Sigma(M)) \cong \mathrm{Idl}(\mathcal{P}(M)), \tag{C.574} \]

as (commutative) C*-algebras and as frames, respectively.


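Proof. In the commutative case, the lattice operations in $\mathcal{P}(M)$ are given by

\[ e \wedge f = ef; \tag{C.575} \]
\[ e \vee f = e + f - ef; \tag{C.576} \]
\[ e^\perp = 1_M - e, \tag{C.577} \]

as may be verified by embedding $M \subset B(H)$ and using the proof of Proposition
C.136; eq. (C.577) is true for any $M$. One then finds that $\mathcal{P}(M)$ is distributive, since

\[ e \wedge (f \vee g) = ef + eg - efg = (e \wedge f) \vee (e \wedge g), \tag{C.578} \]

and similarly with $\vee$ and $\wedge$ swapped. Since $\mathcal{P}(M)$ is orthomodular for arbitrary
von Neumann algebras $M$ and is distributive if $M$ is abelian, it follows that $\mathcal{P}(M)$
is Boolean. Conversely, if $\mathcal{P}(M)$ is Boolean, we may compute

\[ (e \wedge (e \wedge f)^\perp)^\perp = (e \wedge (e^\perp \vee f^\perp))^\perp = ((e \wedge e^\perp) \vee (e \wedge f^\perp))^\perp = (e \wedge f^\perp)^\perp = e^\perp \vee f, \]

and since $f \leq g \vee f$ for any $g$, this implies $f \leq (e \wedge (e \wedge f)^\perp)^\perp$.
Now $f \leq g^\perp$ implies $fg = gf = 0$, so

\[ f (e \wedge (e \wedge f)^\perp) = (e \wedge (e \wedge f)^\perp) f = 0. \tag{C.579} \]

If $g \leq e$, then $e \wedge g = g$, hence $e \wedge g^\perp + g = e \wedge (1_M - g) + g = e$.
So $g = e \wedge f$ gives

\[ e - (e \wedge f) = e \wedge (e \wedge f)^\perp. \tag{C.580} \]

Using (C.579) - (C.580) finally yields

\[ ef = ((e \wedge f) + e - (e \wedge f)) f = (e \wedge f) f + (e \wedge (e \wedge f)^\perp) f = e \wedge f, \tag{C.581} \]

and since $e \wedge f = f \wedge e$, we find $ef = fe$ for any two projections $e, f \in \mathcal{P}(M)$.
Hence $M$ is abelian by Theorem C.135.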

If we now realize the Gelfand spectrum $\Sigma(M)$ as the multiplicative state space of
$M$, and realize the Stone spectrum $\mathcal{S}(\mathcal{P}(M))$ as the space $\mathrm{Pt}(\mathcal{P}(M))$ of points of
$\mathcal{P}(M)$, then a homeomorphism $\Sigma(M) \cong \mathrm{Pt}(\mathcal{P}(M))$ arises as follows:


• First, the restriction $\varphi : \mathcal{P}(M) \to \mathbb{C}$ of any multiplicative state $\omega : M \to \mathbb{C}$ must
be $\{0,1\}$-valued. Using (C.575) - (C.577), it is then easy to show that the ensuing
map $\varphi : \mathcal{P}(M) \to \{0,1\}$ is a homomorphism of Boolean lattices.

• Vice versa, by Corollary B.104 a point $\varphi : \mathcal{P}(M) \to \{0,1\}$ extends by continuity
to a map $\omega : M \to \mathbb{C}$. Since $\varphi$ must preserve the top element $1_M$, this map is
nonzero. By continuity, multiplicativity in general follows from multiplicativity on
projections, which follows by running the previous point backward (or from Theorem C.168).


Finally, (C.573) and (C.574) follow from (C.572) and the Gelfand isomorphism
(Theorem C.8) and eq. (D.35), respectively. See also Theorem C.168 below.
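
As a small sanity check of (C.575) - (C.578), the following sketch enumerates the projection lattice of the abelian algebra $D_3(\mathbb{C})$ of diagonal $3 \times 3$ matrices. Representing a diagonal projection by its tuple of 0/1 diagonal entries is an assumption made for illustration, and the helper names are hypothetical.

```python
from itertools import product

N = 3  # projections in D_3(C) correspond to 0/1 diagonals of length 3

def meet(e, f):
    # (C.575): e ∧ f = ef, computed entrywise on the diagonal
    return tuple(x * y for x, y in zip(e, f))

def join(e, f):
    # (C.576): e ∨ f = e + f - ef
    return tuple(x + y - x * y for x, y in zip(e, f))

def complement(e):
    # (C.577): e^⊥ = 1 - e
    return tuple(1 - x for x in e)

projections = list(product((0, 1), repeat=N))

# Distributivity (C.578), checked over all triples of projections.
distributive = all(
    meet(e, join(f, g)) == join(meet(e, f), meet(e, g))
    for e in projections for f in projections for g in projections
)

# Every projection has a complement, so the lattice is Boolean.
complemented = all(
    meet(e, complement(e)) == (0,) * N and join(e, complement(e)) == (1,) * N
    for e in projections
)
```

Here the lattice is simply the power set of a three-point set, which is the Stone spectrum of $\mathcal{P}(D_3(\mathbb{C}))$ in this toy case.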


Note that (C.574) is a special case of Corollary C.84, for if $M$ is a commutative
von Neumann algebra, and $\mathcal{H}(M)$ its frame of hereditary subalgebras, we have

\[ \mathcal{H}(M) \cong \mathrm{Idl}(\mathcal{P}(M)); \tag{C.582} \]
\[ J \mapsto \{e \in \mathcal{P}(M) \mid Me \subseteq J\}, \tag{C.583} \]

whose inverse maps an ideal $I \subset \mathcal{P}(M)$ to the norm-closure of $\bigcup_{e \in I} Me$ in $M$. In
particular, if $J$ is σ-weakly closed, then $J = Me$ for a unique projection $e \in \mathcal{P}(M)$,
in which case the right-hand side of (C.583) is just the principal ideal $\downarrow e$. To see
this special case, we quote a useful result about arbitrary von Neumann algebras:


Proposition C.142. Let $I$ be a σ-weakly closed left (right) ideal in a von Neumann
algebra $M$. Then there is a unique projection $e \in \mathcal{P}(M)$ such that $I = Me$ ($I = eM$).


Indeed, $e$ is the σ-weak limit of any approximate identity in $I$.


C.22 The Murray-von Neumann classification of factors


After this analysis of abelian von Neumann algebras, we now turn to their opposites,
viz. factors. The main tool in the classification of factors, introduced by Murray and
von Neumann, is a new partial ordering $\lesssim$ on the projection lattice $\mathcal{P}(M)$, which is
defined for general von Neumann algebras $M$. Unlike the familiar partial ordering
$\leq$ (see Proposition C.136), $\lesssim$ gives a total ordering on $\mathcal{P}(M)$ if $M$ is a factor.


Definition C.143. Let $\mathcal{P}(M)$ be the projection lattice of a von Neumann algebra
$M$. We say that $e \sim f$ in $\mathcal{P}(M)$ iff there exists $u \in M$ such that $u^*u = e$ and $uu^* = f$.
Subsequently, we write $e \lesssim f$ if there is $e' \in \mathcal{P}(M)$ with $e \sim e'$ and $e' \leq f$.


It is easy to show that $\sim$ is an equivalence relation. The operator $u$ in this definition
is unitary from $eH$ to $fH$, vanishes on $(eH)^\perp$, and has range $fH$. Such an operator
is therefore a partial isometry (cf. Definition A.27), with initial projection $e$ and
final projection $f$. It follows that a necessary condition for $e \sim f$ is that
$\dim(eH) = \dim(fH)$, but (unless $M = B(H)$) this is by no means sufficient, since the unitary
$u$ that maps $eH$ to $fH$ is required to lie in $M$. For example, if $H = \mathbb{C} \oplus \mathbb{C}$, then
$e = \mathrm{diag}(1,0)$ is equivalent to $f = \mathrm{diag}(0,1)$ with respect to $M = M_2(\mathbb{C})$, but not
with respect to $M = D_2(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C}$ (i.e., the diagonal $2 \times 2$ matrices).
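
The $2 \times 2$ example can be checked by hand or, as in the following sketch, numerically. The helper names are hypothetical, and the search over diagonal matrices only samples a few entries; the actual reason no diagonal witness exists is that a diagonal $d$ always satisfies $d^*d = dd^*$, which cannot equal both $e$ and $f$.

```python
# 2x2 complex matrices as nested tuples; helper names are illustrative only.
def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adjoint(a):
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

e = ((1, 0), (0, 0))   # projection onto the first summand of C ⊕ C
f = ((0, 0), (0, 1))   # projection onto the second summand

# In M_2(C): this matrix unit maps eH isometrically onto fH.
u = ((0, 0), (1, 0))
initial_is_e = equal(matmul(adjoint(u), u), e)   # u*u = e
final_is_f = equal(matmul(u, adjoint(u)), f)     # uu* = f

# In D_2(C): try a few diagonal candidates d = diag(a, b).
def diagonal_witness_exists(samples):
    for a in samples:
        for b in samples:
            d = ((a, 0), (0, b))
            if equal(matmul(adjoint(d), d), e) and equal(matmul(d, adjoint(d)), f):
                return True
    return False

found_in_diagonal = diagonal_witness_exists([0, 1, -1, 1j, -1j, 0.5, 2])
```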

To see how natural this definition is, consider a unitary representation $u$ of a
group $G$ on $H$. If $H_i \subset H$ is stable under $u(G)$, $i = 1,2$, then the restrictions $u_i$ of
$u$ to $H_i$ are unitarily equivalent precisely when $e_1 \sim e_2$ with respect to $M = u(G)'$
(where $e_i$ is the projection onto $H_i$). Furthermore, $u_1$ is unitarily equivalent to a
subrepresentation of $u_2$ iff $e_1 \lesssim e_2$. More generally, if $N \subset B(H)$ is a von Neumann
algebra, with stable subspaces $H_i$, $i = 1,2$, then the restrictions $N_i$ to $H_i$ are unitarily
equivalent iff $e_1 \sim e_2$ with respect to $M = N'$, et cetera.

One may compare projections in $M$ with sets and compare $\leq$, $\sim$, and $\lesssim$ with $\subseteq$
(inclusion), $\cong$ (isomorphism), and $\hookrightarrow$ (the existence of an injective map),
respectively. The Schröder-Bernstein Theorem of set theory (which von Neumann knew
well) states that if $X \hookrightarrow Y$ and $Y \hookrightarrow X$, then $X \cong Y$. Similarly, it can be shown that:


Proposition C.144. If $e \lesssim f$ and $f \lesssim e$, then $e \sim f$.

The special role of factors with respect to the partial ordering $\lesssim$ now emerges.

Proposition C.145. If $M$ is a factor, then $\lesssim$ is a total ordering (i.e., $e \lesssim f$ or $f \lesssim e$).

The property of a factor that leads to this result is:


Lemma C.146. Let $M$ be a factor. For any nonzero projections $e, f \in \mathcal{P}(M)$, there
are nonzero projections $e', f' \in \mathcal{P}(M)$ such that $e' \leq e$, $f' \leq f$, and $e' \sim f'$.


The first step in the Murray—von Neumann classification of factors is as follows: 


Definition C.147. A projection $e$ in $M$ is called finite if $f \sim e$ and $f \leq e$ for some
$f \in \mathcal{P}(M)$ implies $f = e$, and minimal if $f \leq e$, $f \in \mathcal{P}(M)$, implies $f = e$ or $f = 0$.

Accordingly, a factor $M$ is called finite iff $1_M$ is finite, semifinite iff $1_M$ majorizes
a nonzero finite projection, and purely infinite iff all nonzero projections are infinite.

For $M = B(H)$, which is evidently a factor, a projection $e$ is (in)finite iff $\dim(eH)$
is (in)finite, so that $B(H)$ is finite iff $H$ is finite-dimensional, and semifinite
otherwise. Surprisingly, we will see that finite factors different from $M_n(\mathbb{C})$ exist, as do
semifinite factors different from $B(\ell^2)$. Even purely infinite factors (initially defined
as what was left out from the previous two cases) turn out to exist (even in physics).
We first rephrase Definition C.147 in terms of generalized traces.


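Definition C.148. A trace on a von Neumann algebra $M$ is a map

\[ \mathrm{tr} : M^+ \to [0, \infty] \tag{C.584} \]

satisfying

\[ \mathrm{tr}(\lambda \cdot a + b) = \lambda \cdot \mathrm{tr}(a) + \mathrm{tr}(b) \quad (a, b \in M^+, \lambda \geq 0); \tag{C.585} \]
\[ \mathrm{tr}(aa^*) = \mathrm{tr}(a^*a) \quad (a \in M). \tag{C.586} \]

Equivalently, $\mathrm{tr}(uau^*) = \mathrm{tr}(a)$ for all $a \in M^+$ and unitary $u \in M$ (so that $uau^* \in M^+$).
A trace is finite if $\mathrm{tr}(a) < \infty$ for all $a \in M^+$, semifinite if for any nonzero $a \in M^+$
there is a nonzero $b \leq a$ in $M^+$ for which $\mathrm{tr}(b) < \infty$, and infinite otherwise.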


The usual trace $\mathrm{Tr}$ is a trace on $B(H)$ in this new sense, which is finite iff $\dim(H)$
is finite. As we will see, other factors admit other traces. The following result could
have been used as a definition of (semi)finite and purely infinite factors.


Proposition C.149. A factor is (semi)finite iff it admits a faithful σ-weakly
continuous (semi)finite trace, and is purely infinite otherwise.


It can be shown that a finite trace on a factor is automatically σ-weakly continuous,
so a factor is finite iff it admits a faithful finite trace. Hence we recover the fact that
$B(H)$ is finite iff $\dim(H) < \infty$, and semifinite otherwise. For a completely different
kind of trace, defined on factors remote from $B(H)$, we turn to discrete groups $G$.
For these, Haar measure is simply the counting measure, so that $L^2(G) = \ell^2(G)$, and
convolution (C.504) and involution (C.505), initially defined on $C_c(G)$, are given by

\[ f * g(x) = \sum_{y \in G} f(xy^{-1}) g(y); \qquad f^*(x) = \overline{f(x^{-1})}. \tag{C.587} \]


According to Definition C.119, the reduced group C*-algebra $C_r^*(G)$ is the
norm-closure of the *-algebra in $B(L^2(G))$ containing all operators

\[ \pi_L(f)\psi(x) = \sum_{y \in G} f(y) \psi(y^{-1}x) \quad (f \in C_c(G)). \tag{C.588} \]

Thus $C_r^*(G)$ is realized as a concrete C*-algebra of operators on $\ell^2(G)$, so that,
following von Neumann himself, we may form the group von Neumann algebra

\[ W^*(G) = C_r^*(G)''. \tag{C.589} \]

Theorem C.150. The group von Neumann algebra $W^*(G)$ of a countable group is
a factor iff all nontrivial conjugacy classes in $G$ (i.e., all except $\{e\}$) are infinite.


In that case, we say that $G$ has (or "is") icc, i.e., has infinite conjugacy classes.


Proof. From (C.587), for $f \in C_c(G)$ we have $f * g = g * f$ for each $g \in C_c(G)$ iff
$f(yxy^{-1}) = f(x)$ for all $x, y \in G$. In other words, $f$ lies in the center of
$C_c(G) \subset W^*(G)$ iff $f$ is constant on each conjugacy class of $G$. If $G$ is icc, this implies
that $f$ can have support only at $e$, i.e., $f = \lambda \cdot \delta_e$, $\lambda \in \mathbb{C}$. Noting that $\delta_e$ is the unit
in the algebra $C_c(G)$, this proves the claim, except for the fact that we should extend this
argument from $C_c(G)$ to $W^*(G)$, which by Theorem C.127 is its strong closure.

The key to this extension is the fact that one has $f * g(x) = \langle R_{x^{-1}} f^*, g \rangle$ for $f$ and
$g$ in $C_c(G)$, where $R_x f(y) = f(yx)$ and the inner product is in $\ell^2(G)$. Hence

\[ |f * g(x)| = |\langle R_{x^{-1}} f^*, g \rangle| \leq \|R_{x^{-1}} f^*\|_2 \|g\|_2 = \|f\|_2 \|g\|_2, \tag{C.590} \]

so that the sum in (C.587) is actually defined and converges (absolutely) for
$f, g \in \ell^2(G)$. This also shows that if $f_n$ strongly converges to some $a \in B(\ell^2(G))$, i.e.,
$\|f_n * \psi - a\psi\| \to 0$ for each $\psi \in \ell^2(G)$, then $a\psi = f * \psi$, where $f \in \ell^2(G)$ is the
limit of $(f_n)$ seen as a sequence in $\ell^2(G)$. Hence $W^*(G) \subset \ell^2(G)$, and the above
computation of the center of $W^*(G)$ remains valid: we have $f \in W^*(G) \cap W^*(G)'$
iff $f$ is constant on each conjugacy class of $G$. Conversely, any $f$ that is constant
on some finite conjugacy class (different from $\{e\}$) and zero elsewhere is central
without being a multiple of the unit.


Whether or not $G$ has icc, we have a map $\mathrm{tr} : W^*(G) \to \mathbb{C}$, defined by

\[ \mathrm{tr}(f) = f(e), \tag{C.591} \]

which satisfies (C.585) - (C.586) and hence defines a finite trace on $W^*(G)$. Also,

\[ \mathrm{tr}(f) = \langle \delta_e, f * \delta_e \rangle, \tag{C.592} \]

so this trace is σ-weakly continuous.
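
For a finite group the convolution (C.587) and the trace (C.591) can be computed directly. The sketch below uses $S_3$, which is not icc, so it illustrates the converse direction of Theorem C.150: the indicator function of a finite nontrivial conjugacy class is central. Representing permutations as tuples and functions on $G$ as dictionaries is an assumption made for illustration, and all helper names are hypothetical.

```python
from itertools import permutations

# Elements of S_3 as tuples p, with p[i] the image of i.
G = list(permutations(range(3)))
identity = (0, 1, 2)

def compose(p, q):
    # (p q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def convolve(f, g):
    # f * g(x) = sum over y of f(x y^{-1}) g(y), cf. (C.587)
    return {x: sum(f[compose(x, inverse(y))] * g[y] for y in G) for x in G}

def trace(f):
    # tr(f) = f(e), cf. (C.591)
    return f[identity]

def delta(p):
    return {x: (1 if x == p else 0) for x in G}

# Indicator function of the conjugacy class of 3-cycles (finite and nontrivial).
three_cycles = [p for p in G if p != identity and compose(p, compose(p, p)) == identity]
chi = {x: (1 if x in three_cycles else 0) for x in G}

# chi commutes with every delta function, hence with all of C_c(G).
central = all(convolve(chi, delta(p)) == convolve(delta(p), chi) for p in G)

# Two arbitrary integer-valued test functions on G.
f = {x: i + 1 for i, x in enumerate(G)}
g = {x: (i * i) % 5 - 2 for i, x in enumerate(G)}
tracial = trace(convolve(f, g)) == trace(convolve(g, f))
```

Since `chi` is central but not a multiple of `delta(identity)`, the center of the group algebra of $S_3$ is nontrivial, as expected for a group with finite conjugacy classes.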
Corollary C.151. If $G$ has icc, $W^*(G)$ is a finite factor non-isomorphic to any $B(H)$.


Since $G$ must obviously be infinite for it to have icc, $W^*(G)$ is infinite-dimensional,
and hence $W^*(G) \ncong M_n(\mathbb{C})$ for any $n \in \mathbb{N}$. Furthermore, if $H$ is infinite-dimensional,
then $B(H)$ does not admit any σ-weakly continuous finite faithful trace:


Proposition C.152. Any two nonzero σ-weakly continuous (semi)finite traces $\mathrm{tr}, \mathrm{tr}'$
on a (semi)finite factor are proportional, i.e., $\mathrm{tr}' = \lambda\,\mathrm{tr}$ for some $\lambda \in \mathbb{R}^+$.


See also Theorem C.155 below. Consequently, since $\mathrm{Tr}$ and $\mathrm{tr}$ are both σ-weakly
continuous, and $\mathrm{Tr}(1_H) = \infty$ on $B(H)$, whereas $\mathrm{tr}(1_{\ell^2(G)}) = 1$ on $W^*(G)$, we
conclude that $W^*(G) \ncong B(H)$ for any $H$. Note also that (still assuming that $G$ has icc),
all projections in $W^*(G)$ are finite, and $W^*(G)$ has no minimal projections (see
below), whereas $B(H)$ has both finite and infinite projections, and also has plenty of
minimal projections, namely those with one-dimensional range.

Do such "icc" groups actually exist? In fact, there are infinitely many of them:
each free group on $n > 1$ generators is an example. Another example is the group $G_\infty$
of finite permutations of $\mathbb{N}$. A $j$-cycle is a cyclic permutation of $j$ objects (called the
carrier of the cycle in question). Any element $p$ of $G_\infty = \bigcup_n G_n$ is a finite product
of $j$-cycles with disjoint carriers, and for each $j \in \mathbb{N}$, the number of $j$-cycles in such
a decomposition of $p$ is uniquely determined by $p$. Two permutations in $G_\infty$, then,
are conjugate iff they have the same number of $j$-cycles, for all $j \in \mathbb{N}$. Since the
carriers may be moved around within the infinite set $\mathbb{N}$, each nontrivial conjugacy class is infinite.


We present the type classification of factors due to Murray and von Neumann.

Definition C.153. A factor $M$ is said to be of type:

• I if it has at least one minimal projection, subdivided into:

  – Type I$_n$ ($n \in \mathbb{N}$) if $M$ is finite and $1_M$ is the sum of $n$ minimal projections.
  – Type I$_\infty$ if $M$ is type I and semifinite but not finite.

• II if it has no minimal projections, but has some nonzero finite projection, with:

  – Type II$_1$ if $M$ is type II and finite.
  – Type II$_\infty$ if $M$ is type II and semifinite but not finite.

• III if all nonzero projections are infinite.

A nice understanding of these types arises from a construction similar to the trace.


Definition C.154. A dimension function on a von Neumann algebra $M$ is a function
$d : \mathcal{P}(M) \to [0, \infty]$ such that $d(e) < \infty$ iff $e$ is finite, $d(e + f) = d(e) + d(f)$ if
$ef = 0$ (i.e., $eH \perp fH$), and $d(e) = d(f)$ if $e \sim f$.


Paraphrasing results in Murray and von Neumann’s great series of papers, we have: 


Theorem C.155. For any von Neumann algebra $M$, the restriction of a trace to
$\mathcal{P}(M)$ is a dimension function. If $M \subset B(H)$ is a factor, with $H$ separable, then:


1. Any σ-weakly continuous trace on $M$ restricts to a completely additive dimension
function with the additional property that $d(e) = d(f)$ if and only if $e \sim f$.

2. Any dimension function with this additional property arises from a σ-weakly
continuous trace, and hence is completely additive, and unique up to scaling.

3. In that case, the dimension function $d$ induces an isomorphism between $\mathcal{P}(M)/\!\sim$
and some subset of $[0, \infty]$. Suitably scaling $d$, this subset must be one of:


• $\{0, 1, 2, \ldots, n\}$, for some $n \in \mathbb{N}$ (type I$_n$).
• $\mathbb{N} \cup \{\infty\}$ (type I$_\infty$).
• $[0, 1]$ (type II$_1$).
• $[0, \infty]$ (type II$_\infty$).
• $\{0, \infty\}$ (type III).


We may now strengthen the few examples we had so far in the following way: 


Corollary C.156.
• If $\dim(H) = n$, then $B(H)$ is a factor of type I$_n$.
• If $\dim(H) = \infty$, then $B(H)$ is a factor of type I$_\infty$.
• Let $G$ be icc. Then $W^*(G)$ is a factor of type II$_1$.


C.23 Classification of hyperfinite factors


Throughout this section we assume that our von Neumann algebras $M \subset B(H)$ act
on a separable Hilbert space $H$. We say that $M$ is hyperfinite if $M = (\bigcup_n M_n)''$, for
a family of finite-dimensional von Neumann subalgebras $M_n \subset M$ with $M_n \subset M_{n+1}$.
For example, $M = B(H)$ is hyperfinite. If $G$ is a group such that $G = \bigcup_n G_n$ for finite
subgroups $G_n \subset G_{n+1}$, as is the case e.g. for the (icc) group $G_\infty = \bigcup_n G_n$ of finite
permutations of $\mathbb{N}$, then the associated von Neumann algebra $W^*(G)$ is hyperfinite.
Murray and von Neumann partly classified hyperfinite factors, as follows:


Theorem C.157. Let $M \subset B(H)$ be a hyperfinite factor.


• If $M$ is type I$_n$, then $M \cong M_n(\mathbb{C})$.
• If $M$ is type I$_\infty$, then $M \cong B(\ell^2)$.
• If $M$ is type II$_1$, then $M \cong W^*(G_\infty)$.


The unique hyperfinite II$_1$-factor $W^*(G_\infty)$, which turns out to be isomorphic to
$W^*(G)$ for any countable amenable icc group $G$, is usually called $R$. Similarly (and
trivially), $B(\ell^2) \cong B(H')$ for any separable infinite-dimensional Hilbert space $H'$.
An example of a hyperfinite II$_\infty$ factor is also quickly found, viz. $M = R \otimes B(\ell^2)$,
but Murray and von Neumann were unable to classify such factors. About type III,
they knew almost nothing, except for a couple of examples from ergodic theory.
Between 1971 and 1975, Connes made two decisive steps forward in this area:


1. Dividing type III factors into III$_\lambda$, $\lambda \in [0, 1]$, by means of a new invariant.
2. Completely classifying hyperfinite type II$_\infty$ and type III factors, as follows:

   • There is a unique hyperfinite II$_\infty$ factor, namely $R \otimes B(\ell^2)$.
   • There is a unique hyperfinite III$_1$ factor (Connes and Haagerup).
   • There is a unique hyperfinite III$_\lambda$ factor for each $\lambda \in (0, 1)$.
   • There is an infinite family of hyperfinite III$_0$ factors, completely classified by
     the so-called flow of weights introduced by Connes and Takesaki.


We list III$_1$ separately from III$_\lambda$ for $\lambda \in (0, 1)$ for two reasons: first, "hyperfinite
III$_1$" turns out to be the factor occurring in quantum field theory and quantum
statistical mechanics of infinite systems (whereas III$_\lambda$ for $\lambda \in (0, 1)$ seems artificial), and
second, the proof of uniqueness of the hyperfinite III$_1$ factor is much more difficult.

An important technical tool of Connes was his own profound discovery that a
von Neumann algebra $M \subset B(H)$ is hyperfinite iff it is injective, in that there exists
a σ-weakly continuous conditional expectation $E : B(H) \to M$, that is, a linear
map $E : B(H) \to B(H)$ such that $E(a) \in M$ and $E(a^*) = E(a)^*$ for all $a \in B(H)$,
$E^2 = E$, and $\|E\| = 1$. It follows that $E(abc) = aE(b)c$ for all $a, c \in M$, $b \in B(H)$.
The equivalence of hyperfiniteness and injectivity implies, for example, that if
$M = N \otimes B(\ell^2)$ is hyperfinite, then so is $N$. Another crucial tool was the Tomita-Takesaki
theory, which we briefly summarize (this theory was paralleled by simultaneous
and independent work in mathematical physics by the German-Dutch mathematical
physics trio Haag-Hugenholtz-Winnink, which among other things allowed a direct
definition of thermal equilibrium states in infinite volume, see §9.6).


Definition C.158. A von Neumann algebra $M \subset B(H)$ is in standard form if $H$
contains a unit vector $\Omega$ that is cyclic and separating for $M$.


Recall that $\Omega$ is separating for $M$ if $a\Omega \neq 0$ for all nonzero $a \in M$, and that $\Omega$ is
cyclic for $M$ iff it is separating for $M'$. Any von Neumann algebra can be brought
into standard form. For separable $H$, this follows by picking an injective density
operator $\rho$ on $H$, whose associated state $\omega(a) = \mathrm{Tr}(\rho a)$ is faithful (in that
$\omega(a^*a) > 0$ for all nonzero $a \in M$), and passing to the GNS-representation $\pi_\omega(M) \cong M$. For
example, $M = B(H)$ acting on $H$ is not in standard form, but acting on $B_2(H)$ by left
multiplication it is, where $B_2(H)$ is the Hilbert space of Hilbert-Schmidt operators
on $H$ with the familiar inner product $\langle a, b \rangle = \mathrm{Tr}(a^*b)$. If $\rho \in B_1(H)$ is an injective
density operator on $H$, then $\Omega = \sqrt{\rho} \in B_2(H)$ brings $M$ into standard form. In this
case, $M' \cong B(H)^{\mathrm{op}}$ (where the suffix "op" means that multiplication is done in the
opposite order, i.e. $ab$ in $B(H)^{\mathrm{op}}$ is equal to $ba$ in $B(H)$), which acts on $B_2(H)$ by
right multiplication. If $H = \mathbb{C}^n$, one simply has $B(H) = B_2(H) = M_n(\mathbb{C})$.

Let $M \subset B(H)$ be in standard form. Tomita introduced the (unbounded) antilinear
operator $S$ as the closure of the operator $S_0$ having domain $D(S_0) = M\Omega$ and action

\[ S_0(a\Omega) = a^*\Omega. \tag{C.593} \]

This domain is dense because $\Omega$ is cyclic for $M$, the action is well defined since
$\Omega$ is separating for $M$, and $S_0$ indeed turns out to be closable, with closure $S$. Any
closed operator $a$ has a polar decomposition $a = v|a|$, where $v$ is a partial isometry
and $|a| = \sqrt{a^*a}$. We write the polar decomposition of the above operator $S$ as

\[ S = J\Delta^{1/2}, \tag{C.594} \]

where $J$ is an antilinear partial isometry, and $\Delta = S^*S$. Since $S$ is injective with dense
range, $J$ is actually anti-unitary, satisfying $J^* = J$ and $J^2 = 1$. Furthermore, $\Delta > 0$,
so that $\Delta^{it}$ is well defined for $t \in \mathbb{R}$: writing $\Delta = \exp(h)$ for the self-adjoint operator
$h = \log \Delta$, we have $\Delta^{it} = \exp(ith)$. We then have the Tomita-Takesaki Theorem:


Theorem C.159. Let $M \subset B(H)$ be a von Neumann algebra in standard form. Then:


• $M' = JMJ \equiv \{JaJ \mid a \in M\}$.

• For each $t \in \mathbb{R}$ and $a \in M$, the operator $\sigma_t(a) = \Delta^{it} a \Delta^{-it}$ lies in $M$.

• The map $t \mapsto \sigma_t$ is a group homomorphism from $\mathbb{R}$ to $\mathrm{Aut}(M)$ (i.e., the group of
all automorphisms of $M$), which is continuous, in that for each $a \in M$ the function
$t \mapsto \sigma_t(a)$ from $\mathbb{R}$ to $M$ (with σ-weak topology) is continuous.
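
For $M = M_2(\mathbb{C})$ acting by left multiplication on the Hilbert-Schmidt operators, with $\Omega = \sqrt{\rho}$ for a diagonal density matrix $\rho$, the modular data can be written down explicitly. The sketch below assumes the standard formulas $\Delta x = \rho x \rho^{-1}$ and $Jx = x^*$, which are not derived in the text above, and checks them against (C.593) and (C.594) numerically. The eigenvalues in `p`, the test matrices and all helper names are hypothetical choices.

```python
import cmath
import math

p = (0.7, 0.3)   # eigenvalues of an injective density matrix rho = diag(p)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def rho_power(z):
    # rho^z for diagonal rho and a real or complex exponent z
    return [[cmath.exp(z * math.log(p[0])), 0], [0, cmath.exp(z * math.log(p[1]))]]

omega = rho_power(0.5)          # cyclic and separating vector sqrt(rho)

def S(x):
    # Tomita operator (C.593): recover a from x = a Omega, return a^* Omega
    a = matmul(x, rho_power(-0.5))
    return matmul(adjoint(a), omega)

def J(x):
    # assumed modular conjugation: x -> x^*
    return adjoint(x)

def delta_power(z, x):
    # assumed modular operator: Delta^z x = rho^z x rho^{-z}
    return matmul(matmul(rho_power(z), x), rho_power(-z))

def sigma(t, a):
    # modular group sigma_t(a) = rho^{it} a rho^{-it}, acting on M = M_2(C)
    return delta_power(1j * t, a)

a = [[1 + 2j, -0.5j], [3, 0.25 - 1j]]
b = [[0, 1j], [2, -1]]
x = matmul(a, omega)

polar_ok = close(S(x), J(delta_power(0.5, x)))                     # S = J Delta^{1/2}
multiplicative = close(sigma(0.8, matmul(a, b)), matmul(sigma(0.8, a), sigma(0.8, b)))
group_law = close(sigma(0.3 + 0.5, a), sigma(0.3, sigma(0.5, a)))
```

In this finite-dimensional case $\sigma_t$ is inner, being implemented by the unitary $\rho^{it} \in M$, in line with the later remark that the modular group gives no invariants for type I factors.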


The image of $\mathbb{R}$ in $\mathrm{Aut}(M)$ under $\sigma$ is called the modular group of $M$ associated with
the cyclic and separating vector $\Omega$ (or rather, with the associated σ-weakly
continuous faithful state $\omega$). Simple examples show that the modular group explicitly
depends on the vector $\Omega$. In his thesis, Connes analyzed the dependence of $\sigma$ on $\Omega$,
and showed it was innocent. To state the simplest version of his result, assume that
$H$ contains two different vectors $\Omega_1$ and $\Omega_2$, each of which is cyclic and separating
for $M$. We write $\sigma_t^{(i)}$ for the modular group derived from $\Omega_i$, $i = 1, 2$.

Theorem C.160. There is a family $U_t$ of unitary operators in $M$ ($t \in \mathbb{R}$), such that

\[ \sigma_t^{(1)}(a) = U_t \sigma_t^{(2)}(a) U_t^*; \tag{C.595} \]
\[ U_{s+t} = U_s \sigma_s^{(2)}(U_t). \tag{C.596} \]


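Proof. The proof of this theorem is Connes's favourite (as he declared in an
interview), so we present it in some detail. It is based on the following idea. Extend $M$
to $\mathrm{Mat}_2(M)$, i.e., the von Neumann algebra of $2 \times 2$ matrices with entries in $M$, and
let $\mathrm{Mat}_2(M)$ act on $H_2 = H \oplus H$ in the obvious way. Subsequently, let $\mathrm{Mat}_2(M)$
act on $H_4 = H \oplus H \oplus H \oplus H = H_2 \oplus H_2$ by simply doubling the action on $H_2$. The
vector $\Omega = (\Omega_1, 0, 0, \Omega_2) \in H_4$ is then cyclic and separating for $\mathrm{Mat}_2(M)$, with
corresponding modular operator $\Delta = \mathrm{diag}(\Delta_1, \Delta_4, \Delta_3, \Delta_2)$. Here $\Delta_1$ and $\Delta_2$ are just the
operators on $H$ originally defined by $\Omega_1$ and $\Omega_2$, respectively, and $\Delta_3$ and $\Delta_4$ are
certain operators on $H$. Denoting elements of $\mathrm{Mat}_2(M)$ by

\[ a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \tag{C.597} \]

we then have

\[ \Delta^{it} \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \Delta^{-it} = \begin{pmatrix} \tilde{\sigma}_t^{(1)}(a) & 0 \\ 0 & \tilde{\sigma}_t^{(2)}(a) \end{pmatrix}; \tag{C.598} \]
\[ \tilde{\sigma}_t^{(1)}(a) = \begin{pmatrix} \Delta_1^{it} a_{11} \Delta_1^{-it} & \Delta_1^{it} a_{12} \Delta_4^{-it} \\ \Delta_4^{it} a_{21} \Delta_1^{-it} & \Delta_4^{it} a_{22} \Delta_4^{-it} \end{pmatrix}; \tag{C.599} \]
\[ \tilde{\sigma}_t^{(2)}(a) = \begin{pmatrix} \Delta_3^{it} a_{11} \Delta_3^{-it} & \Delta_3^{it} a_{12} \Delta_2^{-it} \\ \Delta_2^{it} a_{21} \Delta_3^{-it} & \Delta_2^{it} a_{22} \Delta_2^{-it} \end{pmatrix}. \tag{C.600} \]

But by Theorem C.159, the right-hand side of (C.598) must be of the form $\mathrm{diag}(b, b)$
for some $b \in \mathrm{Mat}_2(M)$, so that $\tilde{\sigma}_t^{(1)}(a) = \tilde{\sigma}_t^{(2)}(a)$. This allows us to replace
$\Delta_1^{it} a_{12} \Delta_4^{-it}$ in (C.599) by $\Delta_3^{it} a_{12} \Delta_2^{-it}$. We then put $U_t = \Delta_3^{it} \Delta_2^{-it}$, which, unlike
either $\Delta_3^{it}$ or $\Delta_2^{-it}$, lies in $M$, because each entry in $\tilde{\sigma}_t^{(1)}(a)$ must lie in $M$ if all the
$a_{ij}$ do, and here we have taken $a_{12} = 1$. All claims of the theorem may then be verified
using elementary computations with $2 \times 2$ matrices. For example, combining

\[ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & a \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{C.601} \]

with the property $\tilde{\sigma}_t^{(1)}(ab) = \tilde{\sigma}_t^{(1)}(a) \tilde{\sigma}_t^{(1)}(b)$, we recover (C.595). Using the identity

\[ \begin{pmatrix} 0 & U_t \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & U_t \end{pmatrix}, \tag{C.602} \]

evolving each side to time $s$ yields (C.596). A proof from The Book!

We say that an automorphism $\gamma : M \to M$ is inner if there exists a unitary element
$u \in M$ such that $\gamma(a) = uau^*$ for all $a \in M$. The inner automorphisms of $M$ form a
normal subgroup $\mathrm{Inn}(M)$ of the group $\mathrm{Aut}(M)$ of all automorphisms, with quotient
$\mathrm{Out}(M) = \mathrm{Aut}(M)/\mathrm{Inn}(M)$. Theorem C.160 shows that the image $\pi(\sigma(\mathbb{R}))$ of the
modular group in $\mathrm{Out}(M)$ under the canonical projection $\pi : \mathrm{Aut}(M) \to \mathrm{Out}(M)$ is
independent of $\Omega$, and invariants of this image will be invariants of $M$ itself.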

Such invariants are trivial if $M$ is a factor of type I or II, since in that case
$\pi(\sigma(\mathbb{R})) = \{e\}$; to see this in the finite case (i.e., type I$_n$ or type II$_1$), take a finite
trace $\mathrm{tr}$ on $M$ and check that $\Delta = 1$ for $\pi_{\mathrm{tr}}(M) \cong M$. For the semifinite but not finite
case (i.e., type I$_\infty$ or type II$_\infty$), a slight generalization of the GNS-construction leads
to the same conclusion. To find invariants for type III factors, we therefore need to
extract information from the modular group $t \mapsto \sigma_t$ up to inner automorphisms.


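Definition C.161. Let $\alpha : \mathbb{R} \to \mathrm{Aut}(M)$ be a continuous action of $\mathbb{R}$ on $M$, defining:

\[ M^\alpha = \{x \in M \mid \alpha_t(x) = x \ \forall t \in \mathbb{R}\}; \tag{C.603} \]
\[ M_e = \{x \in M \mid xe = ex = x\} \quad (e \in \mathcal{P}(M^\alpha)). \tag{C.604} \]

• The Arveson spectrum $\mathrm{sp}(\alpha)$ of $\alpha$ consists of all $p \in \mathbb{R}$ for which there is a
sequence $(x_n)$ in $M$ with $\|x_n\| = 1$ and $\lim_{n \to \infty} \|\alpha_t(x_n) - e^{ipt}x_n\| = 0$ $\forall t \in \mathbb{R}$.

• For each $e \in \mathcal{P}(M^\alpha)$, the map $\alpha_t : M \to M$ restricts to $\alpha_t^e : M_e \to M_e$, defining a
(group) homomorphism $\alpha^e : \mathbb{R} \to \mathrm{Aut}(M_e)$, $t \mapsto \alpha_t^e$. The Connes spectrum of $\alpha$
is $\Gamma(\alpha) = \exp(\Gamma'(\alpha)) \subset \mathbb{R}^*_+$, where $\Gamma'(\alpha) = \bigcap_{0 \neq e \in \mathcal{P}(M^\alpha)} \mathrm{sp}(\alpha^e) \subset \mathbb{R}$.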

The Connes spectrum $\Gamma(\alpha)$ is a closed subgroup of $\mathbb{R}^*_+$, which has the great virtue
that if $\pi(\alpha(\mathbb{R})) = \pi(\alpha'(\mathbb{R}))$, then $\Gamma(\alpha) = \Gamma(\alpha')$. So if $\sigma$ is the modular group of
$M$ with respect to some state $\omega$, then $\Gamma(\sigma)$ is independent of $\omega$, and may therefore
be called $\Gamma(M)$. This invariant can also be defined through the usual spectrum of
self-adjoint operators on Hilbert space. To this effect, Connes defined and proved

\[ S(M) = \bigcap_\omega \sigma(\Delta_\omega) = \bigcap_{0 \neq e \in \mathcal{P}(M^\sigma)} \sigma(\Delta_{\omega_e}), \tag{C.605} \]

where the first intersection is over all σ-weakly continuous faithful states $\omega$ on $M$,
whereas in the second one takes a fixed σ-weakly continuous faithful state $\omega$ on $M$,
and restricts it to $\omega_e = \omega|_{M_e}$. Furthermore, $\Delta_\omega$ denotes the operator $\Delta$ on $H_\omega$, defined
with respect to the usual cyclic unit vector $\Omega_\omega$ of the GNS-construction, etc. If $M$ is
a type I or II factor, one has $S(M) = \{1\}$, whereas $0 \in S(M)$ iff $M$ is type III.
Connes showed that $\Gamma(M) = S(M) \cap \mathbb{R}^*_+$, and the known classification of closed
subgroups of $\mathbb{R}^*_+$ yields his path-breaking parametrization of type III factors:


Definition C.162. Let $M$ be a type III factor. Then $M$ is said to be of type:

• III$_0$ if $\Gamma(M) = \{1\}$;

• III$_\lambda$, where $\lambda \in (0, 1)$, if $\Gamma(M) = \lambda^{\mathbb{Z}}$;

• III$_1$ if $\Gamma(M) = \mathbb{R}^*_+$.


The unique hyperfinite III$_1$ factor appears throughout algebraic quantum field
theory, where it plays the role of a universal algebra of localized observables.


C.24 Other special classes of C*-algebras


There are many other special classes of C*-algebras apart from von Neumann alge- 
bras and commutative C*-algebras. The classes we consider here contain both com- 
mutative and non-commutative C*-algebras; in the spirit of (exact) Bohrification, 
whenever possible we try to characterize them through properties of their (maximal) 
commutative subalgebras. Like the von Neumann algebras already studied, each 
class in this section is sandwiched between the finite-dimensional C*-algebras, i.e. 
those C*-algebras that are finite-dimensional as a vector space (which it contains), 
and the real rank zero C*-algebras defined below (in which it is contained). 
Finite-dimensional C*-algebras admit a straightforward classification: 


Theorem C.163. Every finite-dimensional C*-algebra $A$ is isomorphic to a direct
sum of matrix algebras, i.e., $A \cong \bigoplus_k M_{n_k}(\mathbb{C})$, where $n_k \in \mathbb{N}$, and the sum is finite.


Proof. Let $A$ be a finite-dimensional C*-algebra, and take the injective representation
$\pi = \bigoplus_{\omega \in P(A)} \pi_\omega$ on $H = \bigoplus_{\omega \in P(A)} H_\omega$, where $P(A)$ is the pure state space of
$A$; cf. the last stage of the proof of Theorem C.87. The proof now unfolds:


1. Since $H_\omega$ is the closure of $\pi_\omega(A)\Omega_\omega$, it must be finite-dimensional.

2. Since each $\omega$ is pure, by Theorem C.90 we must have $\pi_\omega(A)'' = B(H_\omega)$.

3. By Theorem C.127, $\pi_\omega(A)''$ equals the weak or strong closure of $\pi_\omega(A)$, but since
this algebra is finite-dimensional by step 1, these closures coincide with $\pi_\omega(A)$,
and hence $\pi_\omega(A) = B(H_\omega) \cong M_n(\mathbb{C})$, where $n = \dim(H_\omega)$.

4. One can find an injective subrepresentation $\pi_f$ of $\pi$ using only a finite number of
pure states (proof by contradiction to $\dim(A) < \infty$), so that $\pi_f(A) \cong A$.


The real rank of a C*-algebra $A$ is a non-commutative generalization of the
(Lebesgue) covering dimension of a non-empty space $X$, defined as follows. First
say that $\dim(X) \leq n$ iff every open cover of $X$ has an open refinement $\mathcal{V}$ for
which every $x \in X$ is contained in at most $n+1$ elements of $\mathcal{V}$. We then say that
$\dim(X) = n$ iff $\dim(X) \leq n$ but $\dim(X) \nleq n-1$ (such $n$ need not exist).

If $X$ is a compact Hausdorff space, then $\dim(X) = n$ iff $n$ is the smallest integer
such that for every $f \in C(X,\mathbb{R}^{n+1})$ and $\varepsilon > 0$, there is $g \in C(X,\mathbb{R}^{n+1})$ such that
$g(x) \neq 0$ for all $x$ and $\|f - g\|_\infty < \varepsilon$, where $\|f\|_\infty = \sup_{x \in X}\{\|f(x)\|\}$. If no such $n$
exists, we say that $\dim(X) = \infty$. If $g: X \to \mathbb{R}^{n+1}$ is described by its coordinates
$(g_1, \ldots, g_{n+1})$, then $g(x) \neq 0$ for all $x$ iff $\sum_{k=1}^{n+1} g_k(x)^2 > 0$ for all $x$, or equivalently, iff $\sum_k g_k^2$ is invertible
in $C(X)$. We may replace the usual norm $\|v\|$ in $\mathbb{R}^{n+1}$ by the equivalent max-norm,
i.e., $\|v\| = \max_k\{|v_k|\}$, where $v = (v_1, \ldots, v_{n+1})$. If we do so, we may generalize the
covering dimension to possibly noncommutative unital C*-algebras, as follows.

Let $A^n = A \oplus \cdots \oplus A$ (with $n$ terms) be the C*-algebra $A \times \cdots \times A$ with pointwise
operations and norm $\|(a_1, \ldots, a_n)\| = \max_k\{\|a_k\|\}$. Let $Q(A^n)$ be the set of all self-adjoint
elements $(a_1, \ldots, a_n)$ in $A^n$ for which $\sum_k a_k^2$ is invertible (i.e. in $A$). The real
rank $\mathrm{rr}(A)$ of a unital C*-algebra $A$ is defined as the smallest integer $n$ for which
$Q(A^{n+1})$ is dense in $A^{n+1}_{\mathrm{sa}}$, i.e., for which for every $a \in A^{n+1}_{\mathrm{sa}}$ and $\varepsilon > 0$, there is $b \in Q(A^{n+1})$
such that $\|a - b\| < \varepsilon$. If no such $n$ exists, we define $\mathrm{rr}(A) = \infty$. If $A$ has no unit, we
define its real rank as $\mathrm{rr}(A) = \mathrm{rr}(\dot{A})$, i.e., as the real rank of its unitization $\dot{A}$.
Taking $A = C(X)$, it follows from the previous paragraph that

$$\mathrm{rr}(C(X)) = \dim(X). \tag{C.606}$$


Now dim(X) = 0 iff X has a basis of clopen sets, and if X is compact Hausdorff, 
then dim(X) = 0 iff X is a Stone space. Hence from (C.606) we immediately have: 

Proposition C.164. If $A$ is a commutative C*-algebra, then $\mathrm{rr}(A) = 0$ iff $\Sigma(A)$ is Stone.


This makes dimension zero somewhat pathological. On the other hand, for non- 
commutative C*-algebras real rank zero is ubiquitous. Note that if a = a* and a? 
is invertible, then its inverse is positive, too, and has a square-root which inverts a. 
Thus $A$ has real rank zero iff its invertible self-adjoint elements are dense in $A_{\mathrm{sa}}$.


Proposition C.165. Any von Neumann algebra has real rank zero. 
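
Proof. For $a \in A_{\mathrm{sa}}$ and $\varepsilon > 0$, with $A \subset B(H)$, use Theorem B.102 to define

$$b = \left(\mathrm{id}_{\sigma(a)} + \left(\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)} - \mathrm{id}_{\sigma(a)}\right) \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right)(a). \tag{C.607}$$

Using (B.322), we may then compute

$$\begin{aligned}
\|a - b\| &\leq \left\|\left(\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)} - \mathrm{id}_{\sigma(a)}\right) \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right\|_\infty \\
&\leq \left\|\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)}\right\|_\infty + \left\|\mathrm{id}_{\sigma(a)} \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right\|_\infty \\
&\leq \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.
\end{aligned} \tag{C.608}$$

Writing (C.607) as $b = f(a)$, the function $f \in \mathcal{B}(\sigma(a))$ satisfies $f(x) = x$ if $x \notin
[-\varepsilon/2,\varepsilon/2]$ and $f(x) = \tfrac{1}{2}\varepsilon$ if $x \in [-\varepsilon/2,\varepsilon/2]$; either way, $f(x) \neq 0$. Hence $f$ is
invertible in $\mathcal{B}(\sigma(a))$, and therefore $b = f(a)$ is invertible in $W^*(a)$ and in $B(H)$.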
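
The perturbation used in this proof can be tried out on a diagonal (hence commutative) toy case, where the Borel functional calculus simply applies $f$ to each eigenvalue. The following is a minimal sketch under that assumption; the names `perturb`, `spectrum`, and `shifted` are illustrative and do not come from the text.

```python
def perturb(x: float, eps: float) -> float:
    """The function f of (C.607): identity outside [-eps/2, eps/2], constant eps/2 inside."""
    return x if abs(x) > eps / 2 else eps / 2

# Assumption: a is a diagonal self-adjoint matrix, given by its (real) eigenvalues,
# so that f(a) is obtained by applying f to each eigenvalue.
eps = 0.1
spectrum = [-1.0, -0.01, 0.0, 0.02, 3.0]          # a is not invertible: 0 is an eigenvalue
shifted = [perturb(x, eps) for x in spectrum]     # eigenvalues of b = f(a)

# For diagonal matrices the operator norm of a - b is the largest eigenvalue gap.
distance = max(abs(x - y) for x, y in zip(spectrum, shifted))
# b is invertible, with the norm of its inverse bounded by 2/eps.
smallest = min(abs(y) for y in shifted)
```

In this toy case `distance` stays below `eps` and `smallest` is at least `eps / 2`, which mirrors the estimate (C.608) and the invertibility of $f$.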


We now turn to classes of C*-algebras that are sandwiched between the finite- 
dimensional ones at the lower end and those with real rank zero at the upper end. 


Definition C.166. Let A be a unital C*-algebra. Then A is said to be: 


1. Finite-dimensional if it is finite-dimensional as a vector space.

2. AF (Approximately Finite-dimensional) if it is the norm-closure of the union of
some (not necessarily countable) directed set of finite-dimensional C*-subalgebras.

3. Scattered if every $a \in A_{\mathrm{sa}}$ has countable spectrum.

4. A W*-algebra if it is the dual of a Banach space, and a von Neumann algebra
if it is a represented W*-algebra, i.e., $A \subset B(H)$; this can always be achieved.

5. Monotone complete if every upward directed bounded subset in $A_{\mathrm{sa}}$ (under the
usual order $\leq$) has a least upper bound (i.e. supremum).

6. AW* if for each nonempty subset $S \subseteq A$ there is $e \in \mathcal{P}(A)$ so that $R(S) = eA$.

7. Rickart if for each $a \in A$ there is a projection $e \in \mathcal{P}(A)$ so that $R(a) = eA$.

8. Real rank zero if its invertible self-adjoint elements are dense in $A_{\mathrm{sa}}$.


Here a subset $S$ of a poset $P$ is upward directed if for each $x, y \in S$ there is $z \in S$
such that $x \leq z$ and $y \leq z$ (for example, this is true in a complete lattice).
Furthermore, the right-annihilator $R(S)$ of $S \subseteq A$ is defined as

$$R(S) = \{a \in A \mid ba = 0 \ \forall\, b \in S\}, \tag{C.609}$$


and $R(a) = R(\{a\})$; in the presence of an involution, equivalent definitions may be
given in terms of the left-annihilator. In all cases, the projection $e$ is unique. Since
Rickart himself already showed that $A$ is Rickart iff for each nonempty countable
subset $S \subseteq A$ there is $e \in \mathcal{P}(A)$ so that $R(S) = eA$, the difference between Rickart
and AW* lies in the countability assumption on $S$ in the former but not in the latter.
It is known that if a C*-algebra $A$ has a faithful representation on a separable
Hilbert space, then it is a Rickart C*-algebra iff it is an AW*-algebra, but otherwise
these classes are different. Similarly, an AW*-algebra is a W*-algebra iff it
has a separating family of normal states, where normality of functionals on AW*-algebras
is defined as in Definition 4.11, i.e. through complete additivity on orthogonal
families of projections, which always have an upper bound (cf. Theorem
C.169 below). This is the case in all examples relevant to mathematical physics, but
set-theoretically the class of AW*-algebras has higher cardinality than the class of
W*-algebras it contains. It is generally believed that a C*-algebra is Rickart iff it is
monotone $\sigma$-complete, and that it is AW* iff it is monotone complete, but there are
neither proofs of nor counterexamples to these claims. We have the inclusions:


W* $\subset$ monotone complete $\subset$ AW* $\subset$ Rickart $\subset$ real rank zero;
AF $\subset$ real rank zero;
scattered $\subset$ real rank zero.


Scattered C*-algebras may alternatively be characterized as those C*-algebras 
on which every state is a w*-convergent convex sum of pure states; this condition is 
far stronger than what the Krein—Milman theorem gives, namely that every state is 
a w*-limit of some net consisting of finite convex sums of pure states. For example, 
for any Hilbert space $H$ the compact operators $B_0(H)$ form a scattered C*-algebra
(extending the definition of the latter to the non-unital case as appropriate). 

Two kinds of results are of interest for Bohrification: one is the topological char- 
acterization of the commutative case of each class, the other is the characterization 
of the class itself through properties of its commutative subalgebras. Without proof 
we state what is known in this respect. 


Theorem C.167. Let A be a commutative unital C*-algebra. Then A is: 


1. Finite-dimensional iff $\Sigma(A)$ is finite (with discrete topology).

2. AF iff $\Sigma(A)$ is a Stone space.

3. Scattered iff $\Sigma(A)$ is scattered.

4. A W*-algebra or a von Neumann algebra iff $\Sigma(A)$ is hyperstonean.

5. Monotone complete iff $\Sigma(A)$ is stonean.

6. AW* iff $\Sigma(A)$ is stonean.

7. Rickart iff $\Sigma(A)$ is $\sigma$-stonean.

8. Real rank zero iff $\Sigma(A)$ is a Stone space.

Here we used the convention that a Stone space is a zero-dimensional compact
Hausdorff space (equivalently, it is compact Hausdorff and totally disconnected in
the sense that the only connected subsets are points). A ($\sigma$-) stonean space is a
Stone space with the additional property that $\mathrm{Clopen}(\Sigma(A))$ is a ($\sigma$-) complete lattice
(equivalently, a stonean space is a compact Hausdorff space that is extremally
disconnected in that the closure of each open set is open). Furthermore, a space $X$ is
hyperstonean if it is stonean, and for any nonzero $f \in C(X,\mathbb{R}^+)$ there exists a completely
additive positive measure $\mu$ such that $\mu(f) > 0$. In particular, in the commutative
case the classes AF and real rank zero coincide, as do AW* and monotone
complete algebras. A space $X$ is called scattered if each non-empty closed subset
$C \subseteq X$ contains an isolated point (i.e., a point $x \in C$ with an open neighbourhood $U$
such that $U \cap C = \{x\}$). If $X$ is scattered, then it is totally disconnected. An example
of a compact scattered space is $(1/\mathbb{N}) \cup \{0\}$ with the relative topology from $\mathbb{R}$.
This leads to the following generalization and extension of Theorem C.141. 


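Theorem C.168. Let $A$ be a commutative unital C*-algebra. The projections $\mathcal{P}(A)$
in $A$ form a Boolean lattice, which is related to the Gelfand spectrum $\Sigma(A)$ through

$$\mathcal{P}(A) \cong \mathrm{Clopen}(\Sigma(A)). \tag{C.610}$$

If $A$ is also AF, then its Gelfand spectrum $\Sigma(A)$ is a Stone space, and we have

$$\Sigma(A) \cong \mathcal{S}(\mathcal{P}(A)); \tag{C.611}$$
$$\mathcal{O}(\Sigma(A)) \cong \mathrm{Idl}(\mathcal{P}(A)); \tag{C.612}$$
$$A \cong C(\mathcal{S}(\mathcal{P}(A))), \tag{C.613}$$

as topological spaces, frames, and (commutative) C*-algebras, respectively.
Conversely, for any Boolean lattice $L$ the C*-algebra $C(\mathcal{S}(L))$ is AF, and

$$L \cong \mathcal{P}(C(\mathcal{S}(L))). \tag{C.614}$$
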
Proof. Using the Gelfand isomorphism $A \cong C(\Sigma(A))$, eq. (C.610) follows from

$$\mathcal{P}(C(X)) \cong \mathrm{Clopen}(X), \tag{C.615}$$

where $X$ is some compact Hausdorff space. Indeed, if $e^2 = e^* = e \in C(X)$, then $e$
must be $\{0,1\}$-valued, so it must be $e = 1_U$ for some $U \subseteq X$, viz. $U = e^{-1}(\{1\})$.
Since $e \in C(X)$ is continuous, $U$ must be clopen. Conversely, for each $U \in \mathrm{Clopen}(X)$,
the function $1_U \in C(X)$ is a projection, and the maps $U \mapsto 1_U$ and $e \mapsto e^{-1}(\{1\})$ are
each other’s inverse. Theorem D.5 then implies that $\mathcal{P}(A)$ is Boolean.

If $A = C(X)$ is AF, then $C(X) = (\bigcup_\lambda A_\lambda)^-$ is the norm-closure of the union of
finite-dimensional C*-algebras $A_\lambda$, which union by the Stone–Weierstrass theorem
separates points of $X$. Since each $A_\lambda$ is the linear span of its projections, the
projections $\bigcup_\lambda \mathcal{P}(A_\lambda)$ in the finite-dimensional subalgebras already separate points in $X$, and this in turn
implies that $X$ is totally separated, i.e., for each $x \neq y \in X$, there is $U \in \mathrm{Clopen}(X)$
such that $x \in U$ and $y \notin U$. Since a compact Hausdorff space is zero-dimensional
(and hence Stone) iff it is totally separated, $X$ is a Stone space.

Again using Theorem C.8, we only need to prove (C.611) in the special case

$$X \cong \mathcal{S}(\mathcal{P}(C(X))), \tag{C.616}$$

where $X$ is a Stone space; this follows from (C.615) and Theorem D.5. Eq. (C.612)
follows from (D.35), whilst (C.613) is immediate from (C.611) and Theorem C.8.
Finally, using Theorem D.5 we see that (C.614) reduces to (C.615), so we only
need to prove that $C(X)$ is AF for any Stone space $X$. This is just the above proof
of the converse run backwards: since $X$ is totally separated, for each $x \neq y$ we find
$U \in \mathrm{Clopen}(X)$ separating $x$ and $y$, so that also the associated projection $1_U$ separates
$x$ and $y$, and hence $\mathcal{P}(C(X))$ separates the points of $X$. Taking $\Lambda$ to label the finite subsets of
$\mathcal{P}(C(X))$, and $A_\lambda$ to be the finite-dimensional C*-algebra generated by $\lambda \in \Lambda$, by
Stone–Weierstrass we have $C(X) = (\bigcup_\lambda A_\lambda)^-$. Hence $C(X)$ is AF.


Theorem C.169. The claim that a unital C*-algebra lies in class $\mathcal{C}$ iff each of its
maximal abelian *-subalgebras lies in class $\mathcal{C}$ is true for the following classes:


1. Finite-dimensional C*-algebras. 
2. Scattered C*-algebras. 

3. von Neumann algebras. 

4. AW*-algebras. 

5. Rickart C*-algebras. 


The claim is false for AF-algebras, true for monotone complete C*-algebras iff these
coincide with AW*-algebras, false for real rank zero C*-algebras, and unknown for
W*-algebras, which we therefore state as a conjecture:


A C*-algebra is a W*-algebra iff each maximal abelian *-subalgebra is a W*-algebra. 


Proposition C.170. For any C*-algebra $A$ and any projections $e, f \in \mathcal{P}(A)$, we
have $ef = e$ iff $e \leq f$ (with partial ordering $\leq$ as defined in $A_{\mathrm{sa}}$ via $A^+$, cf. §C.7).


Proof. As explained above (C.93), if $a_1 \leq a_2$, then $b^* a_1 b \leq b^* a_2 b$, so $e \leq f$ implies

$$(1_A - f)\, e\, (1_A - f) \leq (1_A - f)\, f\, (1_A - f) = 0. \tag{C.617}$$

However, since $e^2 = e^* = e$, with $c = e(1_A - f)$ we have $(1_A - f) e (1_A - f) = c^* c$.
As $c^* c \geq 0$, eq. (C.617) forces $c^* c = 0$ and hence $c = 0$, so that $(1_A - f)e = c^* = 0$, or $e = fe$. Taking adjoints gives $ef = e$, and
consequently $ef = fe$. Conversely, if $ef = e$, we have $ef = fe$ and hence

$$(f - e)^2 = f - 2e + e = f - e. \tag{C.618}$$

Of course, $f - e = (f - e)^*$, so that (C.618) makes $f - e$ a projection. Since any
projection lies in $A^+$, we have $f - e \geq 0$, and hence $e \leq f$.
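
As a small sanity check of Proposition C.170, one may take real $3 \times 3$ matrices, where projections are symmetric idempotents. The sketch below uses exact rational arithmetic; the matrices `e`, `f`, `g` and the helper names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def is_projection(p):
    """Real case: p is a projection iff p*p == p and p is symmetric."""
    return matmul(p, p) == p and transpose(p) == p

h = Fraction(1, 2)
e = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # projection onto span{(1,0,0)}
f = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projection onto span{(1,0,0),(0,1,0)}
g = [[h, h, 0], [h, h, 0], [0, 0, 0]]   # projection onto span{(1,1,0)}

e_below_f = matmul(e, f) == e and is_projection(sub(f, e))      # e <= f
g_below_f = matmul(g, f) == g and is_projection(sub(f, g))      # g <= f
e_below_g = matmul(e, g) == e or is_projection(sub(g, e))       # neither holds
```

Here `e` and `g` both lie below `f`, but they are not comparable with each other: `e g` differs from `e`, and `g - e` is not a projection.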


The set of projections $\mathcal{P}(A)$ in a C*-algebra is always a poset in the order $\leq$, but
it is not automatically a lattice. It is a $\sigma$-complete lattice if $A$ is Rickart, and hence
also in all “lower” classes, including von Neumann algebras (cf. Proposition C.136),
where $\mathcal{P}(A)$ is even a complete lattice.

Let $A$ be a unital C*-algebra. As we know, the state space $S(A)$ is the set of all
states on $A$, seen as a compact convex set in the w*-topology inherited from the
embedding $S(A) \subset A^*$ (note that $S(A)$ fails to be compact if $A$ lacks a unit). To see
which information $S(A)$ carries about $A$, we need to impoverish $A$ as follows.


Definition C.171. A Jordan algebra is a real commutative (but generally non-associative)
algebra $A$ whose product $\circ$ satisfies (writing $a^2 = a \circ a$):

$$a \circ (b \circ a^2) = (a \circ b) \circ a^2. \tag{C.619}$$

A JB-algebra is a Jordan algebra that is also a (real) Banach space such that:

$$\|a \circ b\| \leq \|a\| \|b\|; \tag{C.620}$$
$$\|a\|^2 \leq \|a^2 + b^2\|. \tag{C.621}$$


Given (C.620), axiom (C.621) is equivalent to $\|a^2\| \leq \|a^2 + b^2\|$ and $\|a^2\| = \|a\|^2$.
It is easy to see that the self-adjoint part $A_{\mathrm{sa}}$ of any C*-algebra $A$ is a JB-algebra if
we put $a \circ b = \tfrac{1}{2}(ab + ba)$, cf. (5.14). If $A$ and $B$ are unital C*-algebras, we say that a
linear map $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ is a Jordan homomorphism if it preserves $\circ$; to this effect
it clearly suffices that $\varphi(a^2) = \varphi(a)^2$ for each $a$. If $\varphi$ in addition is bijective, then it
is called a Jordan isomorphism; in that case its inverse is necessarily linear and also
preserves the Jordan product $\circ$. A Jordan automorphism of a C*-algebra $A$ is
a Jordan isomorphism $A_{\mathrm{sa}} \to A_{\mathrm{sa}}$. Of course, we may complexify $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ so
as to obtain a $\mathbb{C}$-linear map $\varphi_{\mathbb{C}} : A \to B$ that equally well satisfies $\varphi_{\mathbb{C}}(a^2) = \varphi_{\mathbb{C}}(a)^2$,
this time for all $a \in A$ (rather than all $a \in A_{\mathrm{sa}}$). However, the conceptual point here
is that quantum-mechanical observables are supposed to be self-adjoint, and that
the Jordan product (but not the ordinary associative product) always preserves
self-adjointness. Generalizing Proposition 5.19, we then have the key result:
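
The properties of the Jordan product $a \circ b = \tfrac{1}{2}(ab + ba)$ mentioned above can be checked on real symmetric $2 \times 2$ matrices, which play the role of self-adjoint elements. This is only a sketch with exact rational arithmetic; the matrices `a`, `b`, `c` and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary (associative) product of square matrices as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def jordan(a, b):
    """Jordan product (ab + ba)/2."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[Fraction(x + y, 2) for x, y in zip(r, s)] for r, s in zip(ab, ba)]

# Three symmetric (self-adjoint) matrices.
a = [[1, 2], [2, 0]]
b = [[0, 1], [1, 3]]
c = [[1, 0], [0, -1]]
a2 = jordan(a, a)                      # a^2 = a o a

ordinary_is_symmetric = matmul(a, b) == transpose(matmul(a, b))
jordan_is_symmetric = jordan(a, b) == transpose(jordan(a, b))
jordan_identity = jordan(a, jordan(b, a2)) == jordan(jordan(a, b), a2)   # (C.619)
associative = jordan(jordan(a, b), c) == jordan(a, jordan(b, c))
```

For this choice the ordinary product `ab` fails to be symmetric while `a o b` is symmetric, the Jordan identity (C.619) holds, and associativity fails.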


Theorem C.172. Let $A$ and $B$ be unital C*-algebras. There is a bijective correspondence
between Jordan isomorphisms $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ and affine homeomorphisms
$f : S(B) \to S(A)$, given by $f = \varphi^*$ (i.e. $f(\omega)(a) = \omega(\varphi(a))$). In particular, each
affine homeomorphism of $S(A)$ is induced by a Jordan automorphism of $A$.


The proof is similar to Proposition 5.19; generalizing Lemma 5.20 we now have: 


Lemma C.173. Let $A$ and $B$ be unital C*-algebras. Then $f = \varphi^*$ gives a bijective
correspondence between affine bijections $f : S(B) \to S(A)$ and unital positive linear
bijections $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$. Moreover, if $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ is a unital linear bijection, then
$\varphi$ is positive iff $\varphi$ is isometric iff $\varphi$ is a Jordan isomorphism.


Most of the proof is practically the same as for Lemma 5.20 (so we omit it), except
for the last equivalence between invertible unital isometries and Jordan isomorphisms,
which is deeper and relies on Kadison’s inequality $\varphi(a^* a) \geq \varphi(a)^* \varphi(a)$ for
positive unital linear maps $\varphi$ between C*-algebras and normal operators $a$.

A similar result is Hamhalter’s generalization of Dye’s Theorem to AW*-algebras: 

Theorem C.174. Let $A$ and $B$ be AW*-algebras and let $N : \mathcal{P}(A) \to \mathcal{P}(B)$ be an
isomorphism of the corresponding orthocomplemented projection lattices that in
addition preserves arbitrary suprema. If $A$ has no summand isomorphic to either $\mathbb{C}^2$
or $M_2(\mathbb{C})$, then there is a unique Jordan isomorphism $J : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ that extends $N$
(and hence Jordan isomorphisms are characterized by their values on projections).


This generalizes Corollary 5.22 in the main text, but has a much more difficult proof. 


Proof. If $e, f \in \mathcal{P}(A)$ are orthogonal, then so are $N(e)$ and $N(f)$, so that

$$N(e + f) = N(e) + N(f). \tag{C.622}$$

Gleason’s Theorem for AW*-algebras then gives a Jordan homomorphism

$$J_{(e,f)} : AW^*(e,f)_{\mathrm{sa}} \to B_{\mathrm{sa}}, \tag{C.623}$$

where $AW^*(e,f)$ is the AW*-algebra generated by $e$, $f$, and the unit $1_A$, which in
particular preserves all Jordan triple products

$$\{a, b, c\} = (a \circ b) \circ c + a \circ (b \circ c) - b \circ (a \circ c), \tag{C.624}$$

which in terms of the usual operator product equals $\tfrac{1}{2}(abc + cba)$. This implies

$$N((1_A - 2e) f (1_A - 2e)) = (1_B - 2N(e)) N(f) (1_B - 2N(e)), \tag{C.625}$$

which (in the second major step of the proof, after the application of Gleason’s
Theorem) is necessary and sufficient for $N$ to extend to a Jordan isomorphism.
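
The identity $\{a,b,c\} = \tfrac{1}{2}(abc + cba)$ behind (C.624), and its special case $\{s, f, s\} = sfs$ for the symmetry $s = 1 - 2e$ that enters (C.625), can be verified on small matrices. The following is a sketch with exact rational arithmetic; the particular matrices and function names are illustrative and not part of the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary product of square matrices as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(s, a, t, b):
    """Entrywise linear combination s*a + t*b."""
    return [[s * x + t * y for x, y in zip(r, q)] for r, q in zip(a, b)]

def jordan(a, b):
    """Jordan product (ab + ba)/2."""
    half = Fraction(1, 2)
    return lin(half, matmul(a, b), half, matmul(b, a))

def triple(a, b, c):
    """Jordan triple product as in (C.624)."""
    first = lin(1, jordan(jordan(a, b), c), 1, jordan(a, jordan(b, c)))
    return lin(1, first, -1, jordan(b, jordan(a, c)))

def operator_form(a, b, c):
    """(abc + cba)/2 in terms of the ordinary product."""
    half = Fraction(1, 2)
    return lin(half, matmul(matmul(a, b), c), half, matmul(matmul(c, b), a))

h = Fraction(1, 2)
a = [[1, 2], [2, 0]]
b = [[0, 1], [1, 3]]
c = [[1, 0], [0, -1]]
e = [[1, 0], [0, 0]]                     # a projection
f = [[h, h], [h, h]]                     # another projection
s = lin(1, [[1, 0], [0, 1]], -2, e)      # the symmetry 1 - 2e

general_case = triple(a, b, c) == operator_form(a, b, c)
sfs = matmul(matmul(s, f), s)
symmetry_case = triple(s, f, s) == sfs
```

In this example `sfs` is again a projection, namely `[[1/2, -1/2], [-1/2, 1/2]]`, which is why the left-hand side of (C.625) lies in the domain of $N$.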


The structure of Jordan isomorphisms may be inferred from the following remarkable
result, in which a linear map $\varphi : A \to B$ between C*-algebras is called an
anti-homomorphism if $\varphi(a^*) = \varphi(a)^*$ as usual, but $\varphi(ab) = \varphi(b)\varphi(a)$.


Theorem C.175. If $\varphi : A_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$ is a Jordan homomorphism (where $A$ is a C*-algebra
and $H$ is a Hilbert space), there exist three mutually orthogonal projections
$e_1$, $e_2$, $e_3$ in the center $\varphi_{\mathbb{C}}(A)' \cap \varphi_{\mathbb{C}}(A)''$ of the von Neumann algebra $\varphi_{\mathbb{C}}(A)''$, such that:

1. $e_1 + e_2 + e_3 = 1_H$;

2. The map $a \mapsto \varphi_{\mathbb{C}}(a) e_1$ from $A$ to $B(e_1 H)$ is a homomorphism (of C*-algebras).

3. The map $a \mapsto \varphi_{\mathbb{C}}(a) e_2$ from $A$ to $B(e_2 H)$ is an anti-homomorphism (ibid.).

4. The map $a \mapsto \varphi_{\mathbb{C}}(a) e_3$ from $A$ to $B(e_3 H)$ is both a homomorphism and an
anti-homomorphism of C*-algebras (so that $\varphi_{\mathbb{C}}(A) e_3$ is commutative).


If in addition $a \mapsto \varphi_{\mathbb{C}}(a) e_1$ is not an anti-homomorphism and $a \mapsto \varphi_{\mathbb{C}}(a) e_2$ is not a
homomorphism, then $e_1$, $e_2$, and $e_3$ are uniquely determined by these conditions.


Like the previous theorem, the proof of this one exceeds the scope of this book. 


Corollary C.176. Let $J : B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$ be a Jordan isomorphism. Then $J_{\mathbb{C}} :
B(H) \to B(H)$ is either a homomorphism or an anti-homomorphism of C*-algebras.


Proof. The center of $B(H)$ is trivial, so either $e_1 = 1_H$ or $e_2 = 1_H$.

The pure state space $P(A) = \partial_e S(A)$ is the extreme boundary of the state space $S(A)$.
According to the Krein–Milman Theorem B.50, $P(A)$ is not empty, and

$$S(A) = (\mathrm{co}(P(A)))^-, \tag{C.626}$$

see (B.165) for notation. In order to recover $S(A)$ from $P(A)$, the latter obviously
needs more structure than just that of a set. First, it inherits the w*-topology from
$A^*$, but it turns out that we need to equip $P(A)$ with the more refined w*-uniformity.

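In general, a uniform structure on a set $X$ (also called an entourage uniformity)
is a nonempty filter $\mathcal{U}$ on $X \times X$ (i.e., a collection $\mathcal{U} \subset \mathcal{P}(X \times X)$ of subsets of
$X \times X$ such that $U \in \mathcal{U}$ and $U \subseteq V$ imply $V \in \mathcal{U}$, and $U \in \mathcal{U}$ and $V \in \mathcal{U}$ imply
$U \cap V \in \mathcal{U}$) satisfying the following conditions:


1. Each $U \in \mathcal{U}$ contains the diagonal $\Delta_X = \{(x,x) \mid x \in X\}$;
2. If $U \in \mathcal{U}$, then $U^T \in \mathcal{U}$, where $U^T = \{(y,x) \mid (x,y) \in U\}$;
3. If $U \in \mathcal{U}$, then there is some $V \in \mathcal{U}$ such that $V^2 \subseteq U$, where

$$V^2 = \{(x,z) \mid \exists\, y \in X : (x,y) \in V,\ (y,z) \in V\}. \tag{C.627}$$


A set with a uniformity is called a uniform space. If $X$ and $Y$ are uniform spaces with uniformities $\mathcal{U}_X$ and $\mathcal{U}_Y$, a
function $f : X \to Y$ is uniformly continuous if $f^{-1}(V) \in \mathcal{U}_X$ whenever $V \in \mathcal{U}_Y$, where $f^{-1}(V)$ denotes the set of pairs $(x,x')$ with $(f(x), f(x')) \in V$.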

The w*-uniformity $\mathcal{U}_{w^*}$ on $A^*$, where $A$ is any Banach space, is the smallest one
containing all subsets of the type

$$\{(\varphi, \varphi') \in A^* \times A^* : |\varphi(a) - \varphi'(a)| < \varepsilon\}, \tag{C.628}$$

where $a \in A$ and $\varepsilon > 0$; this implies that $U \in \mathcal{U}_{w^*}$ iff $U$ contains a finite intersection of such subsets.
Second, $P(A)$ carries a natural transition probability, cf. Definition 1.17 and
(2.43). For $\omega, \omega' \in P(A)$, this function $\tau : P(A) \times P(A) \to [0,1]$ is defined by

$$\tau(\omega, \omega') = \inf\{\omega(a) \mid a \in A,\ 0 \leq a \leq 1_A,\ \omega'(a) = 1\}. \tag{C.629}$$

This definition, and the following result, are valid even if $A$ has no unit.

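Proposition C.177. Let $A$ be a C*-algebra and define $\tau$ by (C.629). Then

$$\tau(\omega, \omega') = 1 - \tfrac{1}{4}\|\omega - \omega'\|^2, \tag{C.630}$$

and the following dichotomy applies:


1. If $\omega$ and $\omega'$ are equivalent (in the sense that the corresponding GNS-representations
$\pi_\omega$ and $\pi_{\omega'}$ are unitarily equivalent), so that we may assume that the associated
cyclic vectors $\Omega_\omega$ and $\Omega_{\omega'}$ lie in the same Hilbert space, we have

$$\tau(\omega, \omega') = \mathrm{Tr}(e_{\Omega_\omega} e_{\Omega_{\omega'}}) = |\langle \Omega_\omega, \Omega_{\omega'} \rangle|^2. \tag{C.631}$$

2. If $\omega$ and $\omega'$ are inequivalent (in that $\pi_\omega$ and $\pi_{\omega'}$ are inequivalent), then

$$\tau(\omega, \omega') = 0. \tag{C.632}$$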
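
Proof. We first show that (C.630) yields (C.631) and (C.632). In the first case,

$$\begin{aligned}
\|\omega - \omega'\| &= \sup\{|\omega(a) - \omega'(a)|,\ a \in A,\ \|a\| = 1\} \\
&= \sup\{|\langle \Omega_\omega, \pi_\omega(a)\Omega_\omega\rangle - \langle \Omega_{\omega'}, \pi_\omega(a)\Omega_{\omega'}\rangle|,\ a \in A,\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})\pi_\omega(a))|,\ a \in A,\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in \pi_\omega(A),\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in B(H_\omega),\ \|a\| = 1\} \\
&= \|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1,
\end{aligned} \tag{C.633}$$

where $\|\cdot\|_1$ is the trace norm on $B_1(H_\omega)$. In the fifth step we used the fact that the
map $a \mapsto \mathrm{Tr}(ba)$ is $\sigma$-weakly continuous for any $b \in B_1(H_\omega)$, so that we may replace
the supremum over $a \in \pi_\omega(A)$ by the supremum over $a$ in the $\sigma$-weak closure of
$\pi_\omega(A)$, which by Theorem C.130 is $\pi_\omega(A)''$, which in turn is $B(H_\omega)$ because
$\pi_\omega(A)$ is irreducible (since $\omega$ is pure, cf. Theorem C.90). The last step then follows
from Theorem B.146. To compute the last expression in (C.633), we assume that
$\Omega_\omega$ and $\Omega_{\omega'}$ are not proportional (if they are, then $\omega = \omega'$, so that (C.630) reduces
to $1 = 1$, and hence holds). We may then work in the 2-dimensional Hilbert space
spanned by $\Omega_\omega = (1,0)$ and $\Omega_{\omega'} = (c_1, c_2)$, with $|c_1|^2 + |c_2|^2 = 1$. In that case,

$$(e_{\Omega_\omega} - e_{\Omega_{\omega'}})^2 = |c_2|^2 \cdot 1_2; \tag{C.634}$$
$$|e_{\Omega_\omega} - e_{\Omega_{\omega'}}| = \sqrt{(e_{\Omega_\omega} - e_{\Omega_{\omega'}})^2} = |c_2| \cdot 1_2; \tag{C.635}$$
$$\|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1 = \mathrm{Tr}(|e_{\Omega_\omega} - e_{\Omega_{\omega'}}|) = 2|c_2|. \tag{C.636}$$

Using (C.633), this gives

$$1 - \tfrac{1}{4}\|\omega - \omega'\|^2 = 1 - \tfrac{1}{4}\|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1^2 = 1 - |c_2|^2 = |c_1|^2 = |\langle \Omega_\omega, \Omega_{\omega'}\rangle|^2. \tag{C.637}$$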
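
The two-dimensional computation (C.634)–(C.637) is easy to confirm numerically. The sketch below works in $\mathbb{C}^2$ with plain Python complex numbers; the function names and the sample vectors are illustrative assumptions, and the trace norm is computed from the closed-form eigenvalues of a Hermitian $2 \times 2$ matrix.

```python
import math

def projector(v):
    """Rank-one projection |v><v| for a unit vector v in C^2, as a nested list."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def trace_norm_hermitian_2x2(d):
    """Trace norm (sum of absolute eigenvalues) of a Hermitian 2x2 matrix."""
    tr = (d[0][0] + d[1][1]).real
    det = (d[0][0] * d[1][1] - d[0][1] * d[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return abs((tr + disc) / 2.0) + abs((tr - disc) / 2.0)

def transition_probability(v, w):
    """Right-hand side of (C.630), with the norm computed via (C.633)."""
    e, f = projector(v), projector(w)
    d = [[e[i][j] - f[i][j] for j in range(2)] for i in range(2)]
    return 1.0 - 0.25 * trace_norm_hermitian_2x2(d) ** 2

def overlap_squared(v, w):
    """|<v, w>|^2, the right-hand side of (C.631)."""
    return abs(sum(x.conjugate() * y for x, y in zip(v, w))) ** 2

omega = [complex(1), complex(0)]                 # Omega = (1, 0)
omega_prime = [complex(0.6), complex(0, 0.8)]    # Omega' = (c1, c2), |c1|^2 + |c2|^2 = 1
orthogonal = [complex(0), complex(1)]

tau = transition_probability(omega, omega_prime)
tau_orthogonal = transition_probability(omega, orthogonal)
```

For these vectors `tau` agrees with `overlap_squared(omega, omega_prime)`, i.e. with $|c_1|^2$, up to rounding, and orthogonal unit vectors give transition probability zero.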
To deal with the second case, we use the following version of Schur’s Lemma:


Lemma C.178. Let $\pi_\omega$ and $\pi_{\omega'}$ be irreducible representations of some C*-algebra
$A$, and let $w : H_\omega \to H_{\omega'}$ be an intertwiner, i.e., a bounded linear map that satisfies

$$w \pi_\omega(a) = \pi_{\omega'}(a) w \quad (a \in A). \tag{C.638}$$

• If $\pi_\omega$ and $\pi_{\omega'}$ are equivalent, then $w$ is either zero or invertible.
• If $\pi_\omega$ and $\pi_{\omega'}$ are inequivalent, then $w$ is zero.


Proof. The proof is the same as for group representations: taking the adjoint of
(C.638), it follows that $w^* w \in \pi_\omega(A)'$ and $w w^* \in \pi_{\omega'}(A)'$, so by Theorem C.90 (i.e.
the mother of all Schur’s lemmas) we have $w^* w = \lambda \cdot 1_{H_\omega}$ and $w w^* = \mu \cdot 1_{H_{\omega'}}$
for some $\lambda, \mu \in \mathbb{R}^+$ (since $w^* w$ and $w w^*$ are positive operators). Moreover, since
$w w^* w = \lambda w = \mu w$, in fact we have $\lambda = \mu$ whenever $w \neq 0$. If $\lambda > 0$, then the
operator $\lambda^{-1/2} w : H_\omega \to H_{\omega'}$ is a unitary intertwiner, so $\pi_\omega$ and $\pi_{\omega'}$ are equivalent.
If $\lambda = 0$, then $w^* w = 0$ and hence $w = 0$, since $\|w^* w\| = \|w\|^2$.
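
Continuing the proof of (C.632), we form the direct sum

$$\pi(A) = \pi_\omega(A) \oplus \pi_{\omega'}(A); \tag{C.639}$$
$$H = H_\omega \oplus H_{\omega'}. \tag{C.640}$$

The second case of Lemma C.178 then gives

$$(\pi_\omega(A) \oplus \pi_{\omega'}(A))' = \pi_\omega(A)' \oplus \pi_{\omega'}(A)', \tag{C.641}$$

whose right-hand side consists of operators $\lambda \cdot 1_{H_\omega} \oplus \mu \cdot 1_{H_{\omega'}}$ ($\lambda, \mu \in \mathbb{C}$), so that

$$(\pi_\omega(A) \oplus \pi_{\omega'}(A))'' = \pi_\omega(A)'' \oplus \pi_{\omega'}(A)'' = B(H_\omega) \oplus B(H_{\omega'}). \tag{C.642}$$

Once again using Theorem C.130, a computation à la (C.633) therefore gives

$$\begin{aligned}
\|\omega - \omega'\| &= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in B(H_\omega) \oplus B(H_{\omega'}),\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}(e_{\Omega_\omega} a) - \mathrm{Tr}(e_{\Omega_{\omega'}} a')|,\ a \in B(H_\omega),\ a' \in B(H_{\omega'}),\ \|a \oplus a'\| = 1\} \\
&= \sup\{|\mathrm{Tr}(e_{\Omega_\omega} a)|,\ a \in B(H_\omega),\ \|a\| = 1\} + \sup\{|\mathrm{Tr}(e_{\Omega_{\omega'}} a')|,\ a' \in B(H_{\omega'}),\ \|a'\| = 1\} \\
&= \|e_{\Omega_\omega}\|_1 + \|e_{\Omega_{\omega'}}\|_1 = 1 + 1 = 2,
\end{aligned} \tag{C.643}$$

since the trace may be computed in a basis of $H_\omega \oplus H_{\omega'}$ consisting of a basis of $H_\omega$
and a basis of $H_{\omega'}$, and $\|a \oplus a'\| = \max\{\|a\|, \|a'\|\}$ for $a \in B(H_\omega)$ and $a' \in B(H_{\omega'})$.
Finally, we prove that (C.629) and (C.630) coincide. If $\omega$ and $\omega'$ are equivalent,

$$\tau(\omega, \omega') = \inf\{\mathrm{Tr}(e_{\Omega_\omega} \pi_\omega(a)) \mid a \in A,\ 0 \leq a \leq 1_A,\ \mathrm{Tr}(e_{\Omega_{\omega'}} \pi_\omega(a)) = 1\} \tag{C.644}$$

and, as in (C.633), Theorem C.130 allows us to replace the infimum over $a \in A$ by
the one over $a \in B(H_\omega)$. The claim then follows from Theorem 2.12 and eq. (C.631).
Similarly, if $\omega$ and $\omega'$ are inequivalent, eq. (C.642) and Theorem C.130 give

$$\tau(\omega, \omega') = \inf\{\mathrm{Tr}(e_{\Omega_\omega} a) \mid a \in B(H_\omega) \oplus B(H_{\omega'}),\ 0 \leq a \leq 1_H,\ \mathrm{Tr}(e_{\Omega_{\omega'}} a) = 1\},$$

and notice that the infimum zero is reached by $a = 0 \cdot 1_{H_\omega} \oplus 1_{H_{\omega'}}$.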

The final result of this appendix, then, is the “pure” counterpart of Theorem C.172:


Theorem C.179. Let $A$ and $B$ be unital C*-algebras. There is a bijective correspondence
$f = \varphi^*$ between Jordan isomorphisms $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ and bijections
$f : P(B) \to P(A)$ that preserve transition probabilities and are w*-uniformly continuous
along with their inverse. In particular, $\varphi : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ is a Jordan automorphism
of $A$ iff $\varphi^* : P(A) \to P(A)$ has the properties just stated for $f$.

The proof of this theorem is far more difficult than that of Theorem C.172, so we omit it.
If $A \cong C(X)$ and $B \cong C(Y)$ are commutative, we obtain a variation on Corollary
C.22 featuring uniform homeomorphisms. Also, we see from Wigner’s Theorem
5.4.1 that for $A = B(H)$ it is enough to consider normal pure states, in which case
also the (uniform) continuity condition on $f$ is superfluous.

Notes


As already mentioned in the Introduction, the theory of operator algebras on Hilbert 
spaces was created by von Neumann, partly in collaboration with his assistant Mur- 
ray (von Neumann, 1930, 1931, 1938, 1940, 1949; Murray & von Neumann, 1936, 
1937, 1943, reprinted in von Neumann, 1961). His motivation for doing so certainly 
included quantum mechanics, but also functional analysis, measure theory, ergodic 
theory, and representation theory, all of which fields in turn benefited from their 
interaction with operator algebras. Von Neumann (and Murray) studied what they 
called “rings of operators”, which are now deservedly called von Neumann alge- 
bras. John von Neumann (1903-1957) was one of the greatest mathematicians in 
history, especially considering the totality of his oeuvre in pure and applied mathe- 
matics (including numerical mathematics, computer science, and mathematical eco- 
nomics). His work in mathematical physics, notably on the mathematical structure 
of quantum mechanics, in some sense forms a bridge between the two. 

Von Neumann was a Hungarian prodigy; he wrote his first mathematical paper at 
the age of seventeen. Except for this first paper, his early work was in set theory and 
the foundations of mathematics. In the Fall of 1926, he moved to Göttingen to work
with Hilbert. Around 1920, Hilbert had initiated his Beweistheorie, an approach to
the foundations of mathematics whose specific technical goals were not achieved
because of Gödel’s work, but whose overall view of mathematics (i.e. as an activity
whose correctness is to be established purely syntactically and whose meaning is a 
semantic matter to be distinguished from its syntax) still reigns. However, at the time 
that von Neumann arrived, Hilbert was also interested in quantum mechanics. Apart 
from his broad interest in general (mathematical) physics (for example, his Sixth 
Problem from 1900 called for the mathematical axiomatization of physics), Hilbert 
was specifically attracted to quantum mechanics because Göttingen was, next to
Copenhagen, a leading center for research in this area. Indeed, Heisenberg’s (1925)
paper initiating quantum mechanics (at least in its preliminary guise of “matrix mechanics”)
was followed by the Dreimännerarbeit of Born, Heisenberg, and Jordan
(1926), and all three were in Göttingen at the time. Born was one of the few physicists
of his day to be familiar with the concept of a matrix; in previous research he
had even used infinite matrices. Born turned to his former teacher Hilbert for math- 
ematical advice. Aided by his assistants Nordheim and von Neumann, Hilbert thus 
ran a seminar on the mathematical structure of quantum mechanics, and the three 
wrote a joint paper on the subject (which is now exclusively of historical value). 

It was von Neumann (1927ab) who, at the age of 23, discovered the mathematical 
structure of quantum mechanics. In this process, he defined the abstract concept of 
a Hilbert space, which previously had only appeared in examples that went back to 
the work of Hilbert and his pupils on integral equations, spectral theory, and infinite- 
dimensional quadratic forms. Hilbert’s famous memoirs on integral equations had 
appeared between 1904 and 1906; in 1908, his student Schmidt had defined the 
space $\ell^2$ in the modern sense, and F. Riesz had studied the space of all continuous
linear maps on $\ell^2$ in 1912. Various examples of $L^2$-spaces had emerged around the
same time (with hindsight, Hilbert himself mainly worked with the unit ball of $\ell^2$).
However, the abstract notion of a Hilbert space was missing until von Neumann
provided it. In particular, von Neumann saw that Schrödinger’s wave functions were
unit vectors in a Hilbert space of $L^2$ type, and that Heisenberg’s observables were
linear operators on a different Hilbert space, of $\ell^2$ type. A unitary transformation
between these spaces provided the mathematical equivalence between wave mechanics
and matrix mechanics. Moreover, von Neumann developed the spectral theory
of bounded as well as unbounded normal operators on a Hilbert space. This work
culminated in his book Mathematische Grundlagen der Quantenmechanik (1932). 

Despite the tremendous prestige of von Neumann, initially few mathematicians 
recognized the importance of his subsequent theory of operator algebras. For exam- 
ple, after a lecture by von Neumann on operator algebras in the weekly mathematics 
colloquium at Harvard sometime in the 1930s, G. H. Hardy, one of the leading mathematicians
of his time, is reported to have said:¹


“He is quite clearly a brilliant man, but why does he waste his time on this stuff?” 


Fortunately, among those who did study operator algebras were Gelfand & Naimark 
(1943), who linked the subject to Gelfand’s earlier work on (commutative) Banach 
algebras and in doing so created the theory of C*-algebras. This, in turn, was picked 
up by Segal (1947ab), who thereby also restored the link with quantum theory. 

A survey of von Neumann’s mathematical work is given in Oxtoby et al (1958), 
which contains a biographical introduction by von Neumann’s friend and colleague 
Ulam, and some of von Neumann’s correspondence is collected in Rédei (2005b), 
which also contains a short mathematical biography. One of the most insightful 
documents about von Neumann is the rare manuscript Vonneumann (1987) by his 
brother Nicholas, of which the author got a copy from von Neumann’s only PhD student
Israel Halperin, who visited Cambridge on a peace mission in the early 1990s.²
Politically, von Neumann was a controversial figure because of his enthusiastic con- 
tributions to nuclear weapons and the arms race between the USA and the Soviet 
Union; see Heims (1980) and Macrae (1992) for different perspectives on this. A 
substantial scholarly scientific biography of von Neumann remains to be written. 

The history of operator algebras (i.e. von Neumann algebras and C*-algebras, 
which terms were probably introduced by Dieudonné and Segal, respectively) has 
been described in Kadison (1982), Doran & Belfi (1986), and Doran (1994). 

Leading textbooks on operator algebras, written by some of the original contrib- 
utors, are Neumark (1968), Sakai (1971), Dixmier (1977, 1981), Pedersen (1979), 
Kadison & Ringrose (1983, 1986), and Takesaki (2002, 2003a, 2003b). See also 
Murphy (1990), Li (1992), Davidson (1996), Blackadar (2006), and the remarkable 
lectures on von Neumann algebras by algebraic topologist Lurie (2011). Connes 
(1994), written by arguably the greatest contemporary mathematician working in 
operator algebras, also provides innumerable fascinating insights into the subject. 


¹ Reported by G.D. Birkhoff (who overheard Hardy saying this) to his son, Garrett Birkhoff, who
in turn mentioned it to G.C. Rota, who wrote it down in the Introduction to Stern (1991).

² According to Rhodes (1996, pp. 245-246), Halperin was a spy for the Soviet Union, although his
evidence seems limited to the fact that Halperin was arrested in 1946 on suspicion of espionage, having
Klaus Fuchs in his address book.

§C.1. Basic definitions and examples

As in the notes to the previous appendix, we only comment on results whose 
origins are less well known or which are less standard by themselves, the rest be- 
longing to the foundations of the field as described in the textbooks just mentioned. 
Once again, for this reason not all sections in this appendix come with notes. 

§C.2. Gelfand isomorphism

The implication $\omega \in \Sigma(A) \Rightarrow \omega(a) \in \sigma(a)$ ($a \in A$) in the proof of Lemma C.9
also holds in the opposite direction (given that $A$ is a Banach algebra with unit and
$\omega : A \to \mathbb{C}$ is linear); this is the Gleason–Kahane–Żelazko Theorem (Sourour, 1994).
A recent monograph about C(X) is Groenewegen & van Rooij (2016), following up 
on earlier books like Semadeni (1971) and Gillman & Jerison (1976). 


§C.3. Gelfand duality

Proposition C.19 is due to Gelfand & Kolmogorov (1939). In the spirit of the
proof of the Stone–Weierstrass Theorem B.51 in §B.10, let us give an alternative
proof of this proposition (Simon, 2011), which is based on Proposition C.14 and
Corollary B.17. These identify $\Sigma(C(X))$ with the set $\partial_e M_1^+(X)$ of extreme
completely regular probability measures on $X$, provided we identify the latter with the
corresponding functionals on $C(X)$, as in (B.39). That is, we must prove that the
map $x \mapsto \delta_x$ (i.e., the Dirac measure at $x$, which, seen as a functional on $C(X)$, is just
the evaluation map $\mathrm{ev}_x$) is a bijection.


Proof. We first show that a measure $\mu \in \partial_e M_1^+(X)$ must satisfy $\mu(A) = 1$ or
$\mu(A) = 0$ for any measurable $A \subseteq X$. For if there is some measurable $C \subseteq X$ for which
$0 < \mu(C) < 1$, we have a nontrivial convex decomposition $\mu = t\mu_1 + (1-t)\mu_2$, namely $t = \mu(C)$,
$\mu_1(A) = \mu(A|C)$ (i.e., $\mu(A \cap C)/\mu(C)$), and $\mu_2(A) = \mu(A \setminus C)/\mu(X \setminus C)$. From this,
we show that $\mathrm{supp}(\mu)$ is a point. Indeed, if both $x$ and $y \ne x$ would lie in $\mathrm{supp}(\mu)$,
we could separate these with disjoint open sets $x \in U$ and $y \in V$. This would leave
four (im)possibilities:

• $\mu(U) = \mu(V) = 1$ would imply $\mu(X) \ge 2$, contradicting $\mu(X) = 1$;
• $\mu(U) = 0$ would make $U^c \cap \mathrm{supp}(\mu)$ a proper closed subset of $\mathrm{supp}(\mu)$ whose
  open complement has measure zero, contradicting the definition of $\mathrm{supp}(\mu)$; the same
  argument (with $V$ in place of $U$ where necessary) applies to the three remaining cases
  $(\mu(U), \mu(V)) = (0,0)$, $(0,1)$, and $(1,0)$.

Thus $\mathrm{supp}(\mu) = \{x\}$ for some $x \in X$, i.e., $\mu = \delta_x$, so that $\partial_e M_1^+(X) \subseteq X$. Finally, we
also have $X \subseteq \partial_e M_1^+(X)$, since $\delta_x = t\mu_1 + (1-t)\mu_2$ forces

$$\mathrm{supp}(\mu_1) = \mathrm{supp}(\mu_2) = \{x\}, \tag{C.645}$$

and hence $\mu_1 = \mu_2 = \delta_x$.
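
The convex decomposition used in the first step of this proof can be checked by hand on a finite set. The following Python sketch is only an illustration under assumptions that are not in the text: $X$ is a two-point set, a measure is stored as a dictionary of exact fractions, and the helper name `decompose` is hypothetical.

```python
from fractions import Fraction

def decompose(mu, C):
    """Split a probability measure mu (dict: point -> weight) along an event C.

    Returns t = mu(C), mu1 = mu conditioned on C, and mu2 = mu conditioned on
    the complement of C, so that mu = t*mu1 + (1 - t)*mu2.
    """
    t = sum(w for x, w in mu.items() if x in C)
    if not 0 < t < 1:
        raise ValueError("a nontrivial decomposition needs 0 < mu(C) < 1")
    mu1 = {x: (w / t if x in C else Fraction(0)) for x, w in mu.items()}
    mu2 = {x: (Fraction(0) if x in C else w / (1 - t)) for x, w in mu.items()}
    return t, mu1, mu2

# A probability measure on {x, y} that is not a Dirac measure (illustrative weights).
mu = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
t, mu1, mu2 = decompose(mu, {"x"})

# Recombine the two pieces; this should give back mu.
recombined = {p: t * mu1[p] + (1 - t) * mu2[p] for p in mu}
```

Here `mu1` and `mu2` are the Dirac measures at the two points, so `mu` is a proper convex combination and hence not extreme, in line with the argument above.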


In the unital/compact case, categorical Gelfand duality was first established in 
Negrepontis (1969, 1971), and was reproved in a different way by Johnstone (1982). 
Our proof is taken from Landsman (2004), with some improvements in the non-unital
case due to Brandenburg (2015), but it should be considered “folklore”.

In the smooth case, Corollary C.22 is often called Milnor’s exercise. The result 
even holds without the second countability assumption on the manifold X, but with 
a completely different proof (Mrčun, 2005). See also Burtscher (2009).
§C.6. C*-algebras without unit: commutative case

For proper maps see e.g. Bourbaki (1989), §I.10.


§C.10. Hilbert C*-modules and multiplier algebras 

The theory of Hilbert C*-modules goes back to Kaplansky, Paschke, and Rieffel. 
See Lance (1995) and Raeburn & Williams for textbook coverage, and Landsman 
(1998a) for applications to mathematical physics (e.g. constrained quantization). 

Theorem C.76 is due to An Huef, Raeburn, & Williams (2010). 

The Cohen–Hewitt Factorization Theorem à la Fell & Doran (1988), Theorem
V.9.2, adapted to C*-algebras, states that if $A$ and $B$ are C*-algebras and $\alpha: A \to B$
is a homomorphism, then $\{\alpha(a)b \mid a \in A, b \in B\}$ is a closed linear subspace of $B$.
Consequently, if $\alpha$ is nondegenerate, then each element $c \in B$ factors as $c = \alpha(a)b$.
In particular, taking $B = A$ and $\alpha$ to be the identity, we see that Lemma C.47 may
be sharpened to the claim that any $c \in A$ takes the form $c = ab$ for suitable $a, b \in A$.


§C.11. Gelfand topology as a frame 

Our treatment of frames and locales has been borrowed from Mac Lane & Mo- 
erdijk (1992), where also the details of the proof of Theorem C.80 may be found. 
See also Picado & Pultr (2012). Hereditary subalgebras are discussed e.g. in Peder- 
sen (1979) and Blackadar (2006). 

The fact that H(A) forms a complete lattice was noted by Akemann & Bice 
(2014), who also pursued the analogy with open sets, though not in a frame-theoretic 
setting. The theory is still disappointing in various ways, most notably in the fact 
that H(A) fails to be a frame unless A is commutative. Also, Theorem C.86 has (so 
far) been proved by conventional means, i.e., via the Gelfand isomorphism; it would 
be preferable to prove it purely algebraically (and if possible constructively). 

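From a localic point of view, the Gelfand transform $\hat{a}: \Sigma(A) \to \mathbb{C}$ of $a \in A$ should
primarily be described as the corresponding frame map $\hat{a}^{-1}: \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\Sigma(A))$, and
hence, using Corollary C.84, as a frame map

$$\hat{a}^{-1}: \mathcal{O}(\mathbb{C}) \to H(A). \tag{C.646}$$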


Denoting the hereditary subalgebra generated by $a$ by $H_a$, i.e., the closure of $a \cdot A$,
for $U \in \mathcal{O}(\mathbb{C})$ we obtain a nice formula whose use remains to be established:

$$\hat{a}^{-1}(U) = \bigcap_{z \in \mathbb{C} \setminus U} H_{a - z}. \tag{C.647}$$


A direct proof of the last claim of Proposition C.82 uses the property $H(A) = I(A)$
(in the commutative case), the identification of $I \cap J$ with $(IJ)^-$ (i.e., the closure of
the linear span of all $ab$, $a \in I$, $b \in J$, which follows by taking an approximate unit
in $I$ or $J$), and the identification of $\bigvee S$ with the closure of the linear span of $\bigcup S$.
§C.13. Tensor products of Hilbert spaces and C*-algebras 

For the proof of (C.248) see Reed & Simon (1972), Theorem II.10. 

For tensor products of C*-algebras we mainly relied on Lance (1982), Li (1992),
Wegge-Olsen (1993), and Takesaki (2002), by one of the founders of the theory. 
Tensor products of Banach spaces and Hilbert spaces were first studied by Schatten
(1946) and Schatten & von Neumann (1946, 1948). The subject was subsequently 
taken up by Grothendieck (1955) for locally convex spaces, and hence involves two 
of the greatest mathematicians of the twentieth century. Nuclearity of C*-algebras 
is a vast and important field, to which Takesaki (2003) is a good introduction. 

Yet another expression for the maximal C*-norm on $A \otimes B$ arises if we say that
two representations $\pi_A: A \to B(H)$ and $\pi_B: B \to B(H)$ on the same Hilbert space $H$
commute if $\pi_A(a)\pi_B(b) = \pi_B(b)\pi_A(a)$ for all $a \in A$ and $b \in B$. Such a pair defines a
representation $\pi_A \otimes \pi_B$ of $A \otimes B$ by

$$\pi_A \otimes \pi_B(c) = \sum_i \pi_A(a_i)\pi_B(b_i) \quad \Big(c = \sum_i a_i \otimes b_i\Big), \tag{C.648}$$

which makes sense because $(a,b) \mapsto \pi_A(a)\pi_B(b)$ is bilinear and hence (by universality
of $\otimes$) factors through $A \otimes B$. This gives a third formula for $\|\cdot\|_{\max}$, namely

$$\|c\|_{\max} = \sup\{\|\pi_A \otimes \pi_B(c)\|_{B(H)}\}, \tag{C.649}$$

where $\pi_A$ and $\pi_B$ run through all commuting representations of $A$ and $B$. Indeed,
the restrictions of any representation of $A \otimes B$ to $A$ and $B$ define commuting
representations, so that although at first sight the expression (C.649) appears to majorize
(C.265), it must be equal to it in view of the equality of (C.265) and (C.263).

The name projective tensor product for $A \hat{\otimes}_{\max} B$, where $A$ and $B$ are C*-algebras,
is actually confusing, since if $A$ and $B$ are regarded as Banach algebras, their
projective tensor product is usually defined as the completion of $A \otimes B$ in the norm

$$\|c\|_{\mathrm{proj}} = \inf\Big\{\sum_i \|a_i\|\|b_i\|, \ c = \sum_i a_i \otimes b_i\Big\}; \tag{C.650}$$

cf. (C.259), which is defined for any two Banach algebras $A$ and $B$. This may not be
a C*-norm, and hence $A \hat{\otimes}_{\mathrm{proj}} B$ may not be a C*-algebra. However, for any Banach
algebra $C$ with involution, one may canonically construct a C*-algebra $C^*$ (sic)
and a homomorphism $\alpha: C \to C^*$ of involutive Banach algebras, with the universal
property that for any morphism $\beta: C \to D$, where $D$ is a C*-algebra, there is a
unique homomorphism $\beta': C^* \to D$ of C*-algebras such that $\beta = \beta' \circ \alpha$. This
C*-algebra $C^*$, which by the usual argument is unique up to isomorphism, is called the
C*-envelope of $C$. An explicit construction is obtained by completing $C$ in the norm

$$\|c\| = \sup\{\|\pi(c)\|\}, \tag{C.651}$$

where the supremum runs over all representations of $C$ on Hilbert spaces; it is
finite since $\|\pi(c)\| \le \|c\|$ for each $c \in C$, see Dixmier (1977), §1.3.7 and §2.7. It is
easy to see that $\|\cdot\|_{\mathrm{proj}}$ is a cross-norm on $A \otimes B$, and that one has a bijective
correspondence between representations of $A \otimes B$ that satisfy $\|\pi(a \otimes b)\| \le \|a\|\|b\|$ and
representations of $A \hat{\otimes}_{\mathrm{proj}} B$. The point, then, is that one has $A \hat{\otimes}_{\max} B = (A \hat{\otimes}_{\mathrm{proj}} B)^*$.
C*-algebras (with homomorphisms) and $\hat{\otimes}_{\max}$ form a monoidal category (also
called a tensor category), with commutative C*-algebras as a full subcategory CCA.
The map $X \mapsto C_0(X)$ then defines a duality as monoidal categories between the
category LCH$_p$ of locally compact Hausdorff spaces and proper continuous maps (with
cartesian product as a tensor product) and the category CCA$_n$ of commutative
C*-algebras and nondegenerate homomorphisms (with its unique C*-algebraic tensor
product, for example realized as $\hat{\otimes}_{\max}$). Cf. Theorem C.45. See Hofmann (1970).


§C.14. Inductive limits and infinite tensor products of C*-algebras 

For inductive limits of C*-algebras see in particular Sakai (1971); they were orig- 
inally a Japanese invention (Takeda). Infinite tensor products of operator algebras 
(which partly motivated inductive limits) go back to von Neumann (1938). Bounded 
monotone nets converge under very general conditions; see McArthur (1970). 


§C.15. Gelfand isomorphism and Fourier theory 

For details on the Haar measure and for the proof of local compactness of $\hat{G}$ see
Weil (1965), §27. Our approach to the Fourier transform is largely taken from
Deitmar & Echterhoff (2009), where complete proofs may be found (though we
sometimes followed a slightly different approach). In particular, these authors introduced
the Banach spaces $C_0^*(G)$ and $C_0^*(\hat{G})$, whose use forms a marked improvement over
older and less elegant treatments, as in e.g. Rudin (1962) or Folland (1995).

Eq. (C.379) is often called Plancherel’s Theorem. 

We may add a third entry to the ‘symmetric’ isomorphisms (C.379) - (C.380).
The Bruhat space $\mathcal{S}(G)$ of rapidly decreasing functions on $G$ is defined by

$$A(G) = \{f \in L^\infty(G) \mid \exists K \in \mathcal{K}(G)\ \forall n > 0\ \exists C_n > 0\ \forall k > 0: \|f|_{G \setminus K^k}\|_\infty \le C_n k^{-n}\};$$
$$\mathcal{S}(G) = \{f \in L^2(G) \mid f \in A(G),\ \hat{f} \in A(\hat{G})\}.$$

For $G = \mathbb{R}$ this recovers the usual test functions $\mathcal{S}(\mathbb{R})$ (cf. Definition 5.64), where
the condition $f \in A(\mathbb{R})$ gives rapid decrease whereas $\hat{f} \in A(\hat{\mathbb{R}})$ gives smoothness.
Pontryagin duality then yields an isomorphism $\mathcal{S}(G) \cong \mathcal{S}(\hat{G})$ (Osborne, 1975).

The author originally learnt the SNAG-Theorem from Barut & Rączka (1977),
whose proof (due to K. Maurin) is quite different; the argument given above was
inspired by the treatment of projection-valued spectral measures in Conway (2007,
Ch. 9, §1), who calls them spectral measures. Conway also proves our Theorem
C.113 as his Theorem 1.14, albeit for the case where $X$ is compact; passage to the
locally compact case may be done through unitization, as in §C.6. The need for $\pi$ to
be non-degenerate may then be traced back to (our) Lemma C.43.


§C.16. Intermezzo: Lie groupoids 

For introductions to Lie groupoids see Moerdijk & Mrčun (2003) or Mackenzie
(2005), who also described the link with symplectic geometry. For their use in non- 
commutative geometry and mathematical physics cf. Connes (1994) and Landsman 
(1998a, 2006b), respectively. The tangent groupoid was invented by Connes, with 
further contributions by Hilsum & Skandalis (1987), Weinstein (1989) and Lands- 
man (1998a). See also Connes (1994), Landsman (2003), Higson (2010), and van 
Erp (2010) for applications of the tangent groupoid to index theory. 
§C.17. C*-algebras associated to Lie groupoids

C*-algebras associated to locally compact groupoids (with Haar system) were 
first studied in detail by Renault (1980). Originally in the setting of foliation theory, 
the Lie (i.e. smooth) case was pioneered by Connes (1994), who noted in particular 
that Lie groupoids carry an intrinsic Haar system, and gave many interesting exam- 
ples. The uniqueness of C*(G) for Lie groupoids G, i.e., the independence of the 
underlying left Haar system (up to isomorphism) is proved in Paterson (1999). 


§C.18. Group C*-algebras and crossed product algebras 
The locus classicus is Pedersen (1979), but Williams (2007) may even be better. 


§C.19. Continuous bundles of C*-algebras 

The bundles studied in this section were originally introduced by Fell (1961) and 
their theory was further developed by Dixmier & Douady (1963); see also Dixmier 
(1977), Fell & Doran (1988), and, for a modern treatment, Raeburn & Williams 
(1998). Lemma C.125 was part of Dixmier’s definition of a continuous field of C*- 
algebras, before it was recast into the rather more appealing Definition C.121 by 
Kirchberg & Wassermann (1995) and Blanchard (1996). Theorem C.123 is due to 
Landsman & Ramazan (2001); see also Landsman (1998a) for a detailed discussion. 
Aastrup, Nest, & Schrohe (2006) discuss applications to manifolds with boundary. 


§C.20. von Neumann algebras and the σ-weak topology
There are many other topologies on von Neumann algebras, see e.g. Takesaki
(2002), Chapter I. In any case, we only scratch the surface of the subject. 


§C.21. Projections in von Neumann algebras

The first part of the proof of Theorem C.141 is taken from Rédei (1998), Prop. 
4.16. The remainder is adapted from Heunen, Landsman, & Spitters (2012). The de- 
tails of the proof of Theorem C.140 may be found in Takesaki (2002), Thm. 1.1.18; 
see also Dixmier (1981), Ch. 7 and Lurie (2011), lectures 13-17. 


§C.23. Classification of hyperfinite factors 

This material, which is a high point in modern mathematics, is explained in great
detail in Takesaki (2003ab). See also Wright (1989) for the uniqueness of the
hyperfinite III$_1$ factor. In his review MR1030046 (91a:46059) of the latter book for
Mathematical Reviews in 1991, E. Størmer wrote:


‘At the time of writing this review, by far the deepest and most difficult proof in von
Neumann algebra theory is the one of Connes and Haagerup on the uniqueness of the injective
factor of type III$_1$ with separable predual.’


The applications of C*-algebras and von Neumann algebras to quantum field theory
are reviewed in Haag (1992), where the identification of the unique hyperfinite
III$_1$ factor with local algebras of observables may be found in §V.6. This book
also explains the relationship between Tomita—Takesaki theory and quantum sta- 
tistical mechanics, as do Bratteli & Robinson (1981). It should be mentioned that 
the Tomita—Takesaki theory, including the modular group (i.e. of time translations) 
has a classical analogue in Poisson geometry (Weinstein, 1997), which somewhat 
softens the spectacular claim by Connes & Rovelli (1994) that time has a quantum- 
mechanical (or non-commutative) origin related to thermodynamics. 
§C.24. Other special classes of C*-algebras

The classic reference on AW*-algebras and Rickart C*-algebras is Berberian
(1972). For monotone complete C*-algebras see the monograph by Saitô &
Maitland Wright (2015b). Real rank zero was introduced by Brown & Pedersen (1991),
who also proved that the definition of real rank zero in the main text may be replaced 
by an equivalent property that is often taken as the definition: 


Proposition C.180. Let $A$ be a unital C*-algebra. Then rr($A$) = 0 iff the set of
self-adjoint elements with finite spectrum is dense in $A_{\mathrm{sa}}$.


See Davidson (1996), Theorem V.7.3, for a streamlined proof. 

Scattered C*-algebras were independently introduced by Jensen (1977) and Hu- 
ruya (1978). The results in the main text are due to Kusuda (2011). 

Theorem C.167.1 should be obvious. No. 2 is due to Kusuda (2011), no. 3 may
be found in Takesaki (2002), §III.1, no. 4 is (a restatement of) Theorem 2.3.7 in
Saitô & Maitland Wright (2015b), no. 5 is Theorem 1.7.1 in Berberian (1972), no.
6 is Theorem 1.8.1 in the same reference, no. 7 is from Saitô & Maitland Wright
(2015a), and finally no. 8 may be found in Blackadar (1994), §6.1.3.

Theorem C.169.1 is Exercise 4.6.12 in Kadison & Ringrose (1983); it should
be hidden from students that the AMS published two volumes with the answers to
all their exercises! No. 2 is in Kusuda (2011), no. 3 is in Pedersen (1972), no. 4
is (a restatement of) Theorem 8.2.5 in Saitô & Maitland Wright (2015b), and no.
5 easily follows from Corollary 2.7 in Saitô & Maitland Wright (2015a). See also
Lindenhovius (2016), where results of this kind are used to study the invariant $\mathcal{C}(A)$.


§C.25. Jordan algebras and (pure) state spaces of C*-algebras 

Theorem C.172 is Corollary 4.20 in Alfsen & Shultz (2001), based on Kadison 
(1951). See also Roberts & Roepstorff (1969). Theorem C.174 is due to Hamhalter 
(2015); the second step in the proof had been given earlier by Heunen & Reyes 
(2014). A complete proof of Lemma C.173 may be found in Bratteli & Robinson 
(1997), Theorem 3.2.3. In particular, Kadison’s inequality is Proposition 3.2.4 in the 
same book. Theorem C.175 is the culmination of a long chain of argument, starting 
with Jacobson & Rickart (1950) and ending with Thomsen (1982). See also Bratteli 
& Robinson (1987), Theorem 3.2.3. 

The formula (C.629) was proposed by Mielnik (1968, 1969). Otherwise, case 1
of Proposition C.177 is due to Roberts & Roepstorff (1969), who state case 2 with- 
out proof, referring to Glimm & Kadison (1960). Theorem C.179 is due to Shultz 
(1982). A completely different proof of the last claim, based on a reconstruction of 
A from P(A), appears in Landsman (1998a), §I.3. Both authors add further structure 
to P(A) to make it an invariant for A as a C*-algebra, viz. an orientation and a Pois- 
son structure, respectively. The notion of an orientation was originally introduced 
by Alfsen & Shultz in order to make S(A) a complete invariant for A; see their final 
work Alfsen & Shultz (2001, 2003). 


Appendix D 
Lattices and logic 


In this appendix we collect some basic material from the theory of lattices, includ- 
ing Stone’s representation theorem for Boolean lattices and the connection between 
Boolean (Heyting) lattices and classical (intuitionistic) propositional logic. In prepa- 
ration for Appendix E, we also provide an introduction to first-order logic. 


D.1 Order theory and lattices 


One hopes that the reader has seen some of the following concepts before! 


Definition D.1. 1. A preorder on a set $X$ is a subset $R \subseteq X \times X$ (i.e., a relation on
   $X$), where we write $x \le y$ or $y \ge x$ iff $(x,y) \in R$, such that $x \le x$, and $x \le y$ and
   $y \le z$ imply $x \le z$. A preorder is a partial order if in addition $x \le y$ and $y \le x$
   imply $x = y$. A set with a partial order is called a poset (for partially ordered
   set). A poset (or preorder) is directed if every pair $\{x,y\}$ has an upper bound,
   i.e., some $z$ for which $x \le z$ and $y \le z$. A poset may have a largest element (also
   called a top element) denoted by $1$ or $\top$ that satisfies $x \le \top$ for each $x \in X$,
   and/or a smallest element (also called a bottom element) $0$ or $\bot$ that satisfies
   $\bot \le x$ for each $x \in X$. For $x, z \in X$, the order interval $[x,z]$ is defined by

   $$[x,z] = \{y \mid x \le y \le z\}. \tag{D.1}$$

   An atom in a poset with $0$ is an element $x \ne 0$ for which $[0,x] = \{0,x\}$. In other
   words, $x$ is an atom if $x \ne 0$, and $0 \le y \le x$ implies $y = 0$ or $y = x$. Thus $x$ is an
   atom iff $x$ covers $0$, where we say that $x$ covers $y$ if $x \ne y$ and $[y,x] = \{y,x\}$.

   A homomorphism between posets is a map that preserves $\le$. As usual, an
   isomorphism is an invertible (i.e. bijective) homomorphism, such that the inverse
   also preserves the given structure (which, in this case, is $\le$).

   Thus a bijection $\varphi: X \to Y$ between posets $X$ and $Y$ is an isomorphism when
   $\varphi(x) \le \varphi(y)$ iff $x \le y$. In some cases, the inverse of a bijective homomorphism
   automatically preserves the relevant structure.
2. A lattice is a poset in which for any two elements $x, y$, there exists:

   • an element $x \vee y$, called the supremum (sup) of $x$ and $y$, such that

   $$x \le x \vee y; \tag{D.2}$$
   $$y \le x \vee y, \tag{D.3}$$

   and if $x \le z$ and $y \le z$ for some $z$, then $x \vee y \le z$;
   • an element $x \wedge y$, called the infimum (inf) of $x$ and $y$, such that

   $$x \ge x \wedge y; \tag{D.4}$$
   $$y \ge x \wedge y, \tag{D.5}$$

   and if $x \ge z$ and $y \ge z$ for some $z$, then $x \wedge y \ge z$.

   Suprema and infima are unique (if they exist). Equivalently, a lattice may be
   defined algebraically (rather than order-theoretically) as a set equipped with two
   idempotent, commutative, and associative binary operations $\vee$, $\wedge$ that satisfy

   $$x \vee (y \wedge x) = x; \tag{D.6}$$
   $$x \wedge (y \vee x) = x. \tag{D.7}$$

   The corresponding partial ordering is then defined by $x \le y$ if $x \wedge y = x$.

3. A homomorphism between lattices is a map that preserves $\vee$ and $\wedge$.

   In this case, an isomorphism of lattices may be defined as a bijective homomorphism,
   whose inverse automatically preserves $\vee$ and $\wedge$ (and similarly in all other cases
   below). One may also consider order homomorphisms between lattices just
   regarded as posets. This is a weaker notion: a lattice homomorphism is an order
   homomorphism, but not necessarily the other way round. However, an order
   isomorphism between lattices turns out to be the same as a lattice isomorphism.

4. A complete lattice is a poset $X$ in which each subset $S$ of $X$ has a supremum $\bigvee S$
   (i.e. $x \le \bigvee S$ for each $x \in S$, and $x \le z$ for each $x \in S$ implies $\bigvee S \le z$), as well
   as an infimum $\bigwedge S$ (i.e., $x \ge \bigwedge S$ for each $x \in S$, and $x \ge z$ for each $x \in S$ implies
   $\bigwedge S \ge z$). Clearly, taking $S$ finite makes a complete lattice (merely) a lattice. A
   complete lattice $X$ has a largest element $1 = \bigvee X$ and a smallest element $0 = \bigwedge X$.

5. A lattice is distributive if either one (and hence both) of the following equivalent
   properties holds:

   $$x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z); \tag{D.8}$$
   $$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z). \tag{D.9}$$

6. A frame is a complete lattice $X$ which is “infinitely distributive” in that

   $$x \wedge \bigvee S = \bigvee \{x \wedge y, y \in S\}, \tag{D.10}$$

   for arbitrary subsets $S \subseteq X$. A frame is clearly distributive. Frame
   homomorphisms by definition preserve finite infima and arbitrary suprema.
7. A Heyting algebra is a lattice $X$ with top $\top$ and bottom $\bot$, equipped with a map
   $\dashrightarrow: X \times X \to X$, called (material) implication, that satisfies

   $$x \le (y \dashrightarrow z) \text{ iff } (x \wedge y) \le z. \tag{D.11}$$

   A Heyting algebra is automatically distributive. Negation is defined by

   $$\neg x = (x \dashrightarrow \bot). \tag{D.12}$$

   A Heyting algebra is complete when it is complete as a lattice, in that arbitrary
   suprema (and hence also infima) exist. In that case, (D.10) is satisfied, so that a
   complete Heyting algebra is a frame. Conversely, a frame becomes a complete
   Heyting algebra if we define the implication arrow $\dashrightarrow$ by

   $$y \dashrightarrow z = \bigvee \{x \in X \mid x \wedge y \le z\}. \tag{D.13}$$

   However, frames and complete Heyting algebras drift apart as soon as morphisms
   are concerned, for although in both cases one requires maps to preserve the partial
   order, maps between Heyting algebras must preserve $\dashrightarrow$ rather than $\bigvee$.


8. An orthocomplementation on a lattice (poset) $X$ with $0$ and $1$ is a map

   $$\perp: X \to X, \quad x \mapsto x^\perp, \tag{D.14}$$

   that satisfies:

   $$x^{\perp\perp} = x; \tag{D.15}$$
   $$x \le y \Rightarrow y^\perp \le x^\perp; \tag{D.16}$$
   $$x \wedge x^\perp = 0 \quad (x \wedge x^\perp \text{ exists and equals } 0); \tag{D.17}$$
   $$x \vee x^\perp = 1 \quad (x \vee x^\perp \text{ exists and equals } 1). \tag{D.18}$$

   A lattice (poset) with an orthocomplementation is called orthocomplemented.
   A homomorphism of orthocomplemented lattices (posets) is a lattice (order)
   homomorphism that also preserves the orthocomplementation, as well as $0$ and $1$.


9. A lattice is called modular if $x \le z$ implies $x \vee (y \wedge z) = (x \vee y) \wedge z$ for each $y$ (i.e.,
   if distributivity holds merely if $x \le z$).

   Hence modularity is a weakening of the following property:

10. A distributive orthocomplemented lattice is called Boolean. A homomorphism
   between Boolean lattices is just a homomorphism of orthocomplemented lattices.
   An isomorphism of Boolean lattices is an invertible map that preserves $\vee$, $\wedge$, and $\perp$,
   i.e., an invertible homomorphism.

11. An orthocomplemented lattice (poset) is called orthomodular if $x \le z$ implies
   (that $x \vee z^\perp$ and hence $x^\perp \wedge z$ exist and that)

   $$x \vee (x^\perp \wedge z) = z. \tag{D.19}$$

   That is, the modularity axiom holds for $y = x^\perp$ (note that $x \vee (x^\perp \wedge z)$ exists
   because $x \le x \vee z^\perp$). For lattices this axiom is equivalent to each of:

   • $x \le z$ and $x^\perp \wedge z = 0$ imply $x = z$;
   • $xCy$ iff $yCx$, where $xCy$ if $x = (x \wedge y) \vee (x \wedge y^\perp)$ (i.e., $x$ and $y$ are compatible).

   In a Boolean lattice any two elements are compatible, reconfirming the fact
   that orthomodularity is a weakening of modularity and hence of Booleanity.
   A homomorphism between orthomodular lattices (posets) is a map $\varphi$ that
   preserves $0$ and $1$ (and hence preserves $\perp$), and satisfies $\varphi(x \vee y) = \varphi(x) \vee \varphi(y)$
   (if $x \le y^\perp$). An isomorphism between orthomodular lattices (posets) is
   an invertible homomorphism, which is automatically an order isomorphism.
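
The difference between distributive, modular, and general lattices in Definition D.1 can be tested by brute force on small finite examples. The sketch below is an illustration only; the three example lattices (the divisors of 12 under divisibility, the "diamond" M3, and the "pentagon" N5) and all function names are choices made here, not taken from the text. Joins and meets are computed directly from the order relation, as in part 2 of the definition.

```python
from itertools import product

def make_lattice(elems, leq):
    """Return (join, meet) computed from the partial order leq on elems."""
    def join(x, y):
        ub = [z for z in elems if leq(x, z) and leq(y, z)]
        return next(z for z in ub if all(leq(z, w) for w in ub))  # least upper bound
    def meet(x, y):
        lb = [z for z in elems if leq(z, x) and leq(z, y)]
        return next(z for z in lb if all(leq(w, z) for w in lb))  # greatest lower bound
    return join, meet

def is_distributive(elems, join, meet):
    # Check (D.8) on all triples.
    return all(join(x, meet(y, z)) == meet(join(x, y), join(x, z))
               for x, y, z in product(elems, repeat=3))

def is_modular(elems, leq, join, meet):
    # Check the modular law, but only for triples with x <= z.
    return all(join(x, meet(y, z)) == meet(join(x, y), z)
               for x, y, z in product(elems, repeat=3) if leq(x, z))

def atoms(elems, leq, bottom):
    # x is an atom iff x != 0 and the interval [0, x] equals {0, x}.
    return [x for x in elems
            if x != bottom and {y for y in elems if leq(y, x)} == {bottom, x}]

D12 = [1, 2, 3, 4, 6, 12]                      # divisors of 12
divides = lambda x, y: y % x == 0
FIVE = ["0", "a", "b", "c", "1"]               # carrier of both M3 and N5
leq_M3 = lambda x, y: x == y or x == "0" or y == "1"          # a, b, c incomparable
leq_N5 = lambda x, y: leq_M3(x, y) or (x, y) == ("a", "c")    # additionally a < c

results = {}
for name, elems, leq in [("D12", D12, divides), ("M3", FIVE, leq_M3), ("N5", FIVE, leq_N5)]:
    join, meet = make_lattice(elems, leq)
    results[name] = (is_distributive(elems, join, meet),
                     is_modular(elems, leq, join, meet))
```

In `results`, each entry is a pair (distributive, modular): the divisor lattice satisfies both laws, M3 is modular but not distributive, and N5 is neither.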


Every Boolean algebra is a Heyting algebra, but not vice versa; a Heyting algebra is
Boolean iff one and hence both of the following equivalent conditions hold:

$$\neg\neg x = x \quad (x \in X); \tag{D.20}$$
$$(\neg x) \vee x = \top \quad (x \in X), \tag{D.21}$$

which state the law of the excluded middle (famously denied by Brouwer).
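
A small concrete Heyting algebra that is not Boolean is the frame of open sets of a finite topological space, with implication defined as in (D.13). The following sketch assumes the Sierpiński space (two points, three open sets) as the example; this choice and the function names are illustrative and do not come from the text.

```python
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), X]      # Sierpinski topology on {0, 1}

def sup(family):
    """Join in the frame of opens: the union of a family of open sets."""
    return frozenset().union(*family)

def implies(y, z):
    """Implication as in (D.13): the join of all opens x with x ∧ y ≤ z."""
    return sup([x for x in OPENS if x & y <= z])

def neg(x):
    """Negation as in (D.12): x implies bottom."""
    return implies(x, frozenset())

U = frozenset({1})

# The defining property (D.11), checked on all triples of opens.
adjunction_holds = all((x <= implies(y, z)) == (x & y <= z)
                       for x in OPENS for y in OPENS for z in OPENS)

double_negation = neg(neg(U))     # compare with U
excluded_middle = U | neg(U)      # compare with the top element X
```

For the open set `U`, both (D.20) and (D.21) fail: its negation is empty, its double negation is all of `X`, and `U | neg(U)` is just `U`.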
The following result will be used implicitly throughout the main text. 


Proposition D.2. An order isomorphism of a lattice preserves all suprema and in- 
fima that exist. Hence in a complete lattice all suprema and infima are preserved. 


An important source of orthocomplemented lattices is provided by (possibly
infinite-dimensional) complex vector spaces $V$ with inner product, cf. Definition
A.1: the elements of $X$ are the orthoclosed subspaces $L \subseteq V$, i.e., those subspaces
for which $L^{\perp\perp} = L$, where $L^{\perp\perp} = (L^\perp)^\perp$, and orthocomplementation is defined by

$$L^\perp = \{v \in V \mid \forall w \in L: \langle v, w \rangle = 0\}, \tag{D.22}$$

and the partial ordering is given by inclusion. This yields

$$L \wedge M = L \cap M; \tag{D.23}$$
$$L \vee M = (L + M)^{\perp\perp} = (L^\perp \cap M^\perp)^\perp, \tag{D.24}$$

where $L + M$ is the linear span of $L$ and $M$. We have the Amemiya–Araki Theorem:


Theorem D.3. The lattice of orthoclosed subspaces of an inner product space V is 
orthomodular iff V is complete in the norm (A.2) associated to the inner product. 


A space $X$ is called totally disconnected if it has no other connected subspaces
than its points (so any larger subspace of $X$ is the union of two proper clopen sets).


Definition D.4. A Stone space is a totally disconnected compact Hausdorff space. 


Any finite set (with the discrete topology) is a Stone space. The best-known example
of an infinite Stone space is the Cantor set $\{0,1\}^{\mathbb{N}}$ with product topology, which in
addition is metrizable and has no isolated points (these properties even characterize
the Cantor set up to homeomorphism). Stone’s Representation Theorem reads:
Theorem D.5. A lattice $L$ is Boolean iff it is isomorphic to the lattice $\mathrm{Clopen}(X)$ of
all clopen subsets of some Stone space $X$ (partially ordered by set-theoretic
inclusion), where $X$ is uniquely determined by $L$ up to homeomorphism.


Thus the lattice operations in $\mathrm{Clopen}(X)$ are simply given set-theoretically by

$$U \vee V = U \cup V; \tag{D.25}$$
$$U \wedge V = U \cap V, \tag{D.26}$$

with orthocomplementation given by set-theoretic complementation (the theorem is
obviously predicated on the fact that such a lattice is Boolean). The space $X$ is called
the Stone spectrum of $L$, generically denoted by $\mathcal{S}(L)$. Just like Gelfand duality,
Theorem D.5 extends to a categorical duality theorem in an obvious way.

The Stone spectrum $\mathcal{S}(L)$ of $L$ has the following canonical realizations:


1. Consider the space $\mathrm{Pt}(L) = \mathrm{Hom}(L, 2)$, where $2 = \{0,1\}$ is seen as a Boolean
   lattice ordered by $0 \le 1$ (and $0 \ne 1$), with topology inherited from the product
   topology on $2^L$. That is, the basic opens in $\mathrm{Pt}(L)$ are the sets

   $$U_x = \{\varphi \in \mathrm{Pt}(L) \mid \varphi(x) = 1\}, \tag{D.27}$$

   where $x \in L$, and similarly with $1 \rightsquigarrow 0$. This is a Stone space, with isomorphism

   $$L \xrightarrow{\cong} \mathrm{Clopen}(\mathrm{Pt}(L)); \tag{D.28}$$
   $$x \mapsto U_x. \tag{D.29}$$


2. Generalizing the case of a power set (cf. Definition B.49), a filter in a (Boolean)
   lattice $L$ is a nonempty subset $F \subseteq L$ such that $x, y \in F$ implies $x \wedge y \in F$, and
   $y \ge x \in F$ implies $y \in F$ (whence $1 \in F$). A filter $F$ is proper if $F \ne L$, which
   is the case iff $0 \notin F$. An ultrafilter is a filter that is maximal in the set of all
   proper filters, ordered by inclusion. Ultrafilters (i.e. maximal filters) in a Boolean
   lattice are the same as prime filters, which are filters for which $x \vee y \in F$ implies
   $x \in F$ or $y \in F$. More generally, in a distributive lattice with $0$ any maximal filter
   is prime, and the presence of an orthocomplementation also gives the converse
   inclusion. Moreover, a filter $F$ in a Boolean lattice is maximal (and hence prime)
   iff for any $x \in L$ either $x \in F$ or $x^\perp \in F$ (but not both). For $x \in L$, let

   $$U'_x = \{F \in \mathcal{U}(L) \mid x \in F\}, \tag{D.30}$$

   where $\mathcal{U}(L)$ is the set of all ultrafilters on $L$. One has $U'_x \cap U'_y = U'_{x \wedge y}$, as well
   as $U'_x \cup U'_y = U'_{x \vee y}$, $U'_x \subseteq U'_y$ if $x \le y$, and subsets $U'_x \subseteq \mathcal{U}(L)$ form the basis of a
   topology on $\mathcal{U}(L)$ whose open sets are sets $U' \subseteq \mathcal{U}(L)$ with the property that for
   each $F \in U'$ there is $x \in L$ with $F \in U'_x \subseteq U'$. This topology makes $\mathcal{U}(L)$ a Stone
   space, whose basis of clopen sets is given by the $U'_x$, $x \in L$, with isomorphism

   $$L \xrightarrow{\cong} \mathrm{Clopen}(\mathcal{U}(L)); \tag{D.31}$$
   $$x \mapsto U'_x. \tag{D.32}$$


3. Instead of filters, one may consider the dual notion of ideals, obtained by
   reversing the order (and hence swapping $\wedge$ and $\vee$). Thus an ideal in $L$ is a subset $I \subseteq L$
   such that $x, y \in I$ implies $x \vee y \in I$, and $y \le x \in I$ implies $y \in I$ (whence $0 \in I$). An
   ideal $I$ is proper if $I \ne L$, which is the case iff $1 \notin I$. A maximal ideal is an ideal
   that is maximal in the set of all proper ideals, ordered by inclusion. In a Boolean
   lattice, maximal ideals coincide with prime ideals, which are ideals $I$ that do not
   contain $1$, and where $x \wedge y \in I$ implies $x \in I$ or $y \in I$. In a distributive lattice with $0$
   any maximal ideal is prime. The (set-theoretic) complement of a maximal ideal
   is a maximal filter (i.e. an ultrafilter), so that an ideal $I$ in a Boolean lattice is
   maximal (and hence prime) iff for any $x \in L$ either $x \in I$ or $x^\perp \in I$ (but not both).
   The space $\mathcal{I}(L)$ of all maximal (i.e. prime) ideals in $L$ is topologized by basic
   opens $U''_x = \{I \in \mathcal{I}(L) \mid x \notin I\}$, and so this time the desired isomorphism is

   $$L \xrightarrow{\cong} \mathrm{Clopen}(\mathcal{I}(L)); \tag{D.33}$$
   $$x \mapsto U''_x. \tag{D.34}$$


4. Finally, the set $\mathrm{Idl}(L)$ of all ideals in a (Boolean) lattice $L$ is a frame if it is
   partially ordered by inclusion (cf. §C.11). One may realize the points of the frame
   $\mathrm{Idl}(L)$ as its prime elements (cf. Lemma C.85), which are simply the prime (and
   hence maximal) ideals in $L$ considered above. Hence $\mathrm{Pt}(\mathrm{Idl}(L))$ forms a model of
   the Stone spectrum $X$ of $L$, too. The advantage of this realization is that it gives
   a direct description of the topology of $X$ (seen as a frame), namely as

   $$\mathcal{O}(X) \cong \mathrm{Idl}(L). \tag{D.35}$$


The relationship between the first three approaches is that for any $\varphi \in \mathrm{Pt}(L)$, the
set $\varphi^{-1}(\{1\})$ is a maximal filter in $L$, whose complement $\varphi^{-1}(\{0\})$ is a maximal
ideal. This can be shown to give homeomorphisms $\mathrm{Pt}(L) \cong \mathcal{U}(L) \cong \mathcal{I}(L)$,
under which the opens $U_x$, $U'_x$, and $U''_x$ are mapped to each other. The (contravariant)
functorial nature of the Stone spectrum comes out particularly clearly in the
first description: given a homomorphism $h: L \to L'$, we immediately obtain a map
$h^*: \mathrm{Pt}(L') \to \mathrm{Pt}(L)$ by pullback (i.e., $h^*\varphi = \varphi \circ h$). In this description the
isomorphism $X \to \mathrm{Pt}(\mathrm{Clopen}(X))$ is given by $x \mapsto \varphi_x$, where $\varphi_x(U) = 1_U(x)$, with
$U \in \mathrm{Clopen}(X)$. In the second description, the isomorphism $X \to \mathcal{U}(\mathrm{Clopen}(X))$
is given by $x \mapsto \{U \in \mathrm{Clopen}(X) \mid x \in U\}$, which also gives the isomorphism
$X \to \mathcal{I}(\mathrm{Clopen}(X))$ of the third description as $x \mapsto \{U \in \mathrm{Clopen}(X) \mid x \notin U\}$.
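
For a finite Boolean lattice the ultrafilter realization of the Stone spectrum can be computed exhaustively. The sketch below takes $L$ to be the power set of a three-element set; this example and all names in the code are assumptions made for illustration. It enumerates the proper filters by brute force, keeps the maximal ones, and forms the sets $U'_x$ of (D.30).

```python
from itertools import combinations

def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = frozenset({"a", "b", "c"})
L = powerset(S)     # Boolean lattice: meet = &, join = |, complement of x = S - x

def is_filter(F):
    """Nonempty, closed under meets, and upward closed in L."""
    return (len(F) > 0
            and all((x & y) in F for x in F for y in F)
            and all(y in F for x in F for y in L if x <= y))

# Proper filters are those not containing the bottom element (the empty set).
proper_filters = [F for F in powerset(L) if is_filter(F) and frozenset() not in F]
# Ultrafilters are the proper filters that are maximal under inclusion.
ultrafilters = [F for F in proper_filters
                if not any(F < G for G in proper_filters)]
points = frozenset(ultrafilters)

def U(x):
    """The set of all ultrafilters containing x, as in (D.30)."""
    return frozenset(F for F in ultrafilters if x in F)
```

In this example each ultrafilter consists of the subsets containing one fixed element of `S`, so the spectrum has as many points as `S`, carries the discrete topology, and `U` maps `L` bijectively onto the subsets of `points`.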

Eq. (D.35) follows from Theorem D.5, which implies an isomorphism of frames

$$\mathcal{O}(X) \xrightarrow{\cong} \mathrm{Idl}(\mathrm{Clopen}(X)); \tag{D.36}$$
$$U \mapsto \{V \in \mathrm{Clopen}(X) \mid V \subseteq U\}, \tag{D.37}$$

with inverse $I \mapsto \bigcup_{U \in I} U$. However, by itself, eq. (D.35) may also be taken as a
constructive version of Stone’s Representation Theorem; the next, non-constructive
step (relying on Zorn’s Lemma) then gives the points of $X$ from $\mathrm{Idl}(L)$, cf. §C.11.
To close this brief introduction to lattice theory, we present a general construction
of free distributive lattices, possibly with relations, which will be needed for the
theory of the constructive Gelfand spectrum in §12.2. The main advantage of this
construction is that it can be performed in any topos, as will indeed be done in §12.4.


Definition D.6. The free distributive lattice $\mathcal{L}_S$ on a set $S$ is the set of irredundant
finite subsets $\{A_1,\ldots,A_n\}$ of the finite power set $P_f(S)$ of $S$, i.e. $A_i \subseteq S$, $|A_i| < \infty$,
$n \in \mathbb{N}$, and no $A_i$ is a proper subset of any $A_j$, with lattice operations inductively
generated (using distributivity) from the following singleton cases:


$\{\{s\}\} \vee \{\{t\}\} = \{\{s\},\{t\}\}$; (D.38)
$\{\{s\}\} \wedge \{\{t\}\} = \{\{s,t\}\}$. (D.39)


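For $\{A_1,\ldots,A_n\} \in \mathcal{L}_S$ as above, and similarly $\{B_1,\ldots,B_m\} \in \mathcal{L}_S$, these rules imply


$\{A_1,\ldots,A_n\} \vee \{B_1,\ldots,B_m\} = \{A_1,\ldots,A_n,B_1,\ldots,B_m\}_{ir}$; (D.40)
$\{A_1,\ldots,A_n\} \wedge \{B_1,\ldots,B_m\} = \{A_i \cup B_j \mid i = 1,\ldots,n,\ j = 1,\ldots,m\}_{ir}$, (D.41)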


where the subscript $ir$ means that redundancies in the above sense have been removed
by deleting any set on the list that properly contains some other set on the
list. The motivation for this rule is that, using distributivity, any element $x$ of a
distributive lattice can be brought into the (“normal”) form $x = x_1 \vee \cdots \vee x_n$, where each
$x_i = y^{(i)}_1 \wedge \cdots \wedge y^{(i)}_{m_i}$ is a finite meet. We then identify $A_i$ with $\{y^{(i)}_1,\ldots,y^{(i)}_{m_i}\}$, so
that $x_i = \bigwedge A_i$, and identify $\{A_1,\ldots,A_n\}$ with $x_1 \vee \cdots \vee x_n$. If we allow empty sets
(as we do), then $\mathcal{L}_S$ has both a bottom element $\bot = \bigvee \emptyset$ and a top element $\top = \bigwedge \emptyset$.
Consequently, an equivalent description of $\mathcal{L}_S$ is to first define the set $\Sigma$ of all
formal expressions inductively defined by the rules: (i) $S \subset \Sigma$, $\bot \in \Sigma$, and $\top \in \Sigma$;
(ii) if $x \in \Sigma$ and $y \in \Sigma$, then $x \vee y \in \Sigma$ and $x \wedge y \in \Sigma$. Secondly, we quotient $\Sigma$ by the
equivalence relation generated by all of the basic identities in a distributive lattice,
i.e., the commutativity, associativity, idempotency, and distributivity laws for $\vee$ and
$\wedge$, the rules $x \vee \bot = x$ and $x \wedge \top = x$, and the absorption law $x \vee (x \wedge y) = x$. The
lattice operations on the quotient are the ones inherited from concatenation on $\Sigma$.
As in most free constructions, the map $S \mapsto \mathcal{L}_S$ is left adjoint to the forgetful
functor from the category of distributive lattices into Sets. One has a canonical map
$i : S \to \mathcal{L}_S$, given by $i(s) = \{\{s\}\}$, with the universal property that any function
$f : S \to L$ from $S$ to some distributive lattice $L$ factors through $\mathcal{L}_S$, i.e., there is a unique
lattice homomorphism $g : \mathcal{L}_S \to L$ such that $f = g \circ i$. Indeed, $g$ may be inductively
generated from the special case $g(\{\{s\}\}) = f(s)$ using the rules (D.38) - (D.39).
One may enrich this construction by introducing a congruence $\sim$ on $\mathcal{L}_S$, e.g., one
generated by relations $x_i = y_i$, $i \in I$. In that case, the ensuing quotient $\mathcal{L}_S/\sim$ exists,
and is universal for homomorphisms $f : \mathcal{L}_S \to M$ of distributive lattices that satisfy
$f(x_i) = f(y_i)$, i.e., if $p : \mathcal{L}_S \to \mathcal{L}_S/\sim$ is the canonical projection, there is a unique
homomorphism of distributive lattices $g : (\mathcal{L}_S/\sim) \to M$ such that $f = g \circ p$.
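
The construction of Definition D.6 can be tried out concretely. The following is a minimal Python sketch, assuming that the rules (D.40) - (D.41) read as "take the union of the two lists" for the join and "take all pairwise unions" for the meet, followed by removal of redundancies. The names irredundant, generator, join, meet and extend are illustrative choices that do not occur in the text, and extend only treats the two-element target lattice {False, True}.

```python
from itertools import product

def irredundant(sets):
    """Delete every set that properly contains another set on the list."""
    family = {frozenset(a) for a in sets}
    return frozenset(a for a in family if not any(b < a for b in family))

def generator(s):
    """The image i(s) = {{s}} of a generator s of the free lattice."""
    return frozenset({frozenset({s})})

def join(x, y):
    """Join: merge the two lists, then remove redundancies."""
    return irredundant(x | y)

def meet(x, y):
    """Meet: all pairwise unions A_i u B_j, then remove redundancies."""
    return irredundant(a | b for a, b in product(x, y))

BOTTOM = frozenset()              # the empty join
TOP = frozenset({frozenset()})    # the empty meet

def extend(f, x):
    """Homomorphism g into the lattice {False, True} with g(i(s)) = f(s):
    a join (any) of finite meets (all)."""
    return any(all(f(s) for s in a) for a in x)
```

For instance, with s, t, u = generator("s"), generator("t"), generator("u"), the absorption law join(s, meet(s, t)) == s and the distributive law meet(s, join(t, u)) == join(meet(s, t), meet(s, u)) can be checked directly.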
D.2 Propositional logic


The topos-theoretical approach to quantum logic discussed in Chapter 12 uses an ad- 
vanced version of an elementary construction in algebraic logic that relates classical 
propositional logic to Boolean algebras (or lattices), and similarly relates intuitionistic
propositional logic to Heyting algebras. Though easy to state, these relationships
are conceptually quite deep, based as they are on a separation between syntax and 
semantics that is decidedly “modern”, reflecting a view on the nature of mathematics 
that would have been completely foreign to e.g. Newton and Euler or even Gauss, 
not to speak of Euclid and Archimedes, notwithstanding their use of the axiomatic- 
deductive method that has been a defining property of (real) mathematics since its 
birth in Plato’s Academy. As expressed by Boole himself, this modern view is: 


‘They who are acquainted with the present state of the theory of Symbolic Algebra, are 
aware that the validity of the processes of analysis does not depend upon the interpretation 
of the symbols which are employed, but solely upon the laws of their combination.’ 
(Boole, 1847, Preface) 


The formalization of mathematics starts with propositional logic, whose notation 
consists of the following groups of symbols in terms of which a theory is defined: 


1. Purely logical symbols $\neg$, $\wedge$, $\vee$, and $\to$ (which, because of the axioms they will
be subject to, may later be interpreted as not, and, or, implies, respectively).

2. Non-logical symbols $p_1, p_2, \ldots$ (also written $p, p', \ldots$ or $p, q, \ldots$), which denote
atomic propositions (being the simplest examples of propositions, see below).
The set $\Sigma = \{p_1, \ldots\}$ (at most countable) is called the signature of a theory.


As in arithmetic, there is some ambiguity to be dispelled. This may be done either by
introducing brackets (, ), subject to obvious rules we omit, or by conventions to the
effect that $\neg$ “binds” symbols more strongly than $\vee$ and $\wedge$, which in turn “bind” more
strongly than $\to$. For example, $\neg\alpha \vee \delta \to \beta \wedge \gamma$ is the same as $((\neg\alpha) \vee \delta) \to (\beta \wedge \gamma)$.

In propositional logic (unlike in first-order logic), well-formed formulae and
propositions coincide; typically denoted by Greek letters $\alpha, \beta, \ldots$, both are defined
as expressions in the above symbols that (iteratively) arise in the following way:


i) Each non-logical symbol $p_1, p_2, \ldots$ present in the signature $\Sigma$ is a proposition.
ii) If $\alpha$ and $\beta$ are propositions, then so are $\alpha \wedge \beta$, $\alpha \vee \beta$, $\alpha \to \beta$, and $\neg\alpha$.


Also here one may use brackets in the obvious way, e.g., if $\alpha$ is $p_1 \to p_2$, and $\beta$ is
$p_1 \wedge p_3$, then $(p_1 \to p_2) \to (p_1 \wedge p_3)$ is the same as $\alpha \to \beta$.
For example, one may check that the following expression is a valid proposition:


$(p_1 \to (p_2 \to p_3)) \to ((p_1 \to p_2) \to (p_1 \to p_3))$. (D.42)


A final informal symbol we use is $\equiv$, as in $\alpha \equiv \beta$, which has no logical meaning,
but states that $\alpha$ is the same as $\beta$ (e.g., for $\alpha \equiv (p_1 \to (p_2 \to p_3))$, consider $\neg\alpha$).

The notion of a (propositional) theory will be picked up later, but we now interrupt
the construction of the syntax of propositional logic and discuss its semantics.
In its most elementary form, this means that there is a valuation on $\Sigma$, i.e.,

$V : \Sigma \to \{0,1\}$, (D.43)


also called a truth function, where 0 = false and 1 = true; one often writes $\alpha = 1$
for $V(\alpha) = 1$ (i.e., $\alpha$ is true), and $\alpha = 0$ if $\alpha$ is false (this formally introduces a new
symbol “=”, which however is foreign to propositional logic). Let $B_\Sigma$ be the set of
all propositions (i.e., well-formed formulae) on the given signature $\Sigma$. With abuse
of notation (justified by the property $\Sigma \subset B_\Sigma$), $V$ uniquely extends to a function


$V : B_\Sigma \to \{0,1\}$, (D.44)


as follows. First, each $V(p_i)$ is fixed by the given function (D.43). Second, the value
of $V$ on compound expressions is (iteratively) determined through the use of truth
tables, which formalize the everyday meaning of the symbols $\neg$, $\wedge$, $\vee$, $\to$:


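$\alpha$ | $\neg\alpha$      $\alpha$ $\beta$ | $\alpha \wedge \beta$      $\alpha$ $\beta$ | $\alpha \vee \beta$      $\alpha$ $\beta$ | $\alpha \to \beta$
0 | 1        0 0 | 0        0 0 | 0        0 0 | 1
1 | 0        0 1 | 0        0 1 | 1        0 1 | 1
             1 0 | 0        1 0 | 1        1 0 | 0
             1 1 | 1        1 1 | 1        1 1 | 1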


The first table should be read as follows: if $\alpha$ is false, then $\neg\alpha$ is true, and if $\alpha$
is true, then $\neg\alpha$ is false. Similarly, the second table means that if $\alpha$ and $\beta$ are both
false, then so is $\alpha \wedge \beta$, etc. For example, to see if $\gamma = p_1 \wedge (\neg p_2)$ is true or false given
the valuation $p_1 = p_2 = 0$, we first look at the truth table for $\neg$ with $\alpha = p_2$, inferring
from the first row that $\neg p_2 = 1$ if $p_2 = 0$. We subsequently inspect the table for $\wedge$
with $\alpha = p_1$ and $\beta = \neg p_2$. Since $p_1 = p_2 = 0$ is the same as $p_1 = 0$ and $\neg p_2 = 1$,
we look at the second row, obtaining $\gamma = 0$. Another example, just involving the
implication symbol $\to$, is (D.42), given e.g. $p_1 = 1$, $p_2 = 0$, and $p_3 = 1$. This is
settled through the following steps, each of which involves the table for $\to$:


1. Taking $\alpha = p_2 = 0$ and $\beta = p_3 = 1$, the second row gives $(p_2 \to p_3) = 1$.

2. Taking $\alpha = p_1 = 1$ and $\beta = (p_2 \to p_3) = 1$, row 4 yields $(p_1 \to (p_2 \to p_3)) = 1$.

3. Similarly, $(p_1 \to p_2) = 0$ and $(p_1 \to p_3) = 1$.

4. From these, the second row gives $((p_1 \to p_2) \to (p_1 \to p_3)) = 1$.

5. Finally, taking $\alpha = (p_1 \to (p_2 \to p_3)) = 1$ and $\beta = ((p_1 \to p_2) \to (p_1 \to p_3)) = 1$
in the fourth row gives


$(p_1 \to (p_2 \to p_3)) \to ((p_1 \to p_2) \to (p_1 \to p_3)) = 1$. (D.45)


The proposition in (D.42) is actually rather special, in that all truth values for the
atomic propositions $(p_1, p_2, p_3)$ it contains make it true (as is easily checked).


Definition D.7. A proposition $\alpha$ that is true whatever the (un)truth of the atomic
propositions it contains, is called a tautology, denoted by $\models \alpha$.

For example, $\alpha \to \alpha$ is a tautology for any proposition $\alpha$; this follows from the truth
table for $\to$ by replacing $\beta$ by $\alpha$, in which case only the first and the fourth rows
are consistent (both yielding 1). Introducing a new logical symbol $\leftrightarrow$ by stipulating
that $\alpha \leftrightarrow \beta$ is the same as $(\alpha \to \beta) \wedge (\beta \to \alpha)$, one easily proves:


Theorem D.8. The proposition $\alpha \leftrightarrow \beta$ is a tautology iff $\alpha$ and $\beta$ are either both
true or both false for each joint truth value of the atomic propositions they contain.


Here $\alpha$ and $\beta$ need not contain the same atomic propositions, but if they do, this
theorem says that $\alpha \leftrightarrow \beta$ is a tautology iff $\alpha$ and $\beta$ have the same truth table.
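
The truth-table semantics described above is easy to mechanize. The following Python sketch is illustrative only: it assumes an ad hoc encoding of propositions as nested tuples (atoms are strings, compound propositions are tagged "not", "and", "or", "imp"), and the function names atoms, value, is_tautology, imp and iff are choices made here rather than notation from the text. It extends a valuation to compound propositions and checks tautologies by running through all valuations.

```python
from itertools import product

# An atom is a string such as "p1"; compound propositions are tuples
# ("not", a), ("and", a, b), ("or", a, b), ("imp", a, b).

def atoms(phi):
    """The set of atomic propositions occurring in phi."""
    if isinstance(phi, str):
        return {phi}
    return set().union(*(atoms(sub) for sub in phi[1:]))

def value(phi, v):
    """Extend a valuation v (dict: atom -> 0 or 1) to the proposition phi."""
    if isinstance(phi, str):
        return v[phi]
    op, args = phi[0], [value(sub, v) for sub in phi[1:]]
    if op == "not":
        return 1 - args[0]
    if op == "and":
        return min(args)
    if op == "or":
        return max(args)
    if op == "imp":
        return max(1 - args[0], args[1])
    raise ValueError(f"unknown connective {op!r}")

def is_tautology(phi):
    """True iff phi has value 1 under every valuation of its atoms."""
    names = sorted(atoms(phi))
    return all(value(phi, dict(zip(names, bits))) == 1
               for bits in product((0, 1), repeat=len(names)))

def imp(a, b):
    return ("imp", a, b)

def iff(a, b):
    """a <-> b as an abbreviation of (a -> b) and (b -> a)."""
    return ("and", imp(a, b), imp(b, a))

# The proposition (p1 -> (p2 -> p3)) -> ((p1 -> p2) -> (p1 -> p3)).
EXAMPLE = imp(imp("p1", imp("p2", "p3")),
              imp(imp("p1", "p2"), imp("p1", "p3")))
```

Here is_tautology(EXAMPLE) confirms that the example proposition is true for all eight valuations of its atoms, whereas is_tautology(imp("p", "q")) fails.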

Here and in what follows, one should distinguish theorems about logic from the- 
orems within logic. The former are themselves derived from logical rules that can be 
formalized, as first done by Hilbert and his school in “meta-mathematics”. The lat- 
ter is what we now turn to, motivated by the above semantic intermezzo. The syntax 
of any logical system, such as propositional logic, is completed by stating axioms 
and deduction rules that enable one to prove theorems. In the case of propositional 
logic, these are propositions (i.e., expressions correctly formed from rules i) and ii) 
above) that can be derived from the axioms and deduction rules in a finite number 
of steps, starting with (some of) the axioms and applying (some of) the deduction 
rules to the previous step of the proof. The axioms are considered to be theorems,
too. Theorems are often denoted by $\varphi$, and to show that a proposition $\varphi$ is indeed a
theorem we write $\vdash \varphi$. Thus the question if $\vdash \varphi$ holds is purely syntactic, and hence
is independent of the truth-value of the atomic propositions $p_i$ in $\varphi$.

This is a baby version of the fundamental idea of Boole mentioned above, that the
possible meaning of mathematical symbols should not affect the validity of mathematical
reasoning about them. Nonetheless, there is a consistency requirement (on
the axioms and deduction rules) that one should not be able to derive $\varphi$ if $\varphi$ is
semantically false under some truth assignment to the atomic propositions it contains.
In other words, a theorem must be true for any truth assignment to the pertinent
atomic propositions, or, then again, a theorem within propositional logic must be a
tautology, symbolically: $\vdash \varphi$ implies $\models \varphi$ (meta-mathematically). This is the soundness
condition on any logical system. Conversely, one would like to prove as many
true propositions as possible. Optimally, this is expressed by the completeness
condition that $\models \varphi$ imply $\vdash \varphi$. If both hold, i.e., if a system is sound as well as complete,
one has $\vdash \varphi$ iff $\models \varphi$: in (other) words, a proposition is a theorem iff it is a tautology.

Achieving this should be the goal of our axioms and deduction rules. This can 
indeed be done in propositional logic (and also in first-order logic, on a suitable
interpretation of $\models$, see §D.4). Even this requirement does not fix the axioms and the
deduction rules, although it clearly makes any two such systems equivalent, in the 
sense that each leads to the same theorems (namely the tautologies). In particular, 
one can switch between axioms and deduction rules (matters like this were first sys- 
tematically sorted out by Hilbert and his school, notably Bernays and Ackermann, 
partly motivated by the Principia Mathematica of Russell and Whitehead). 

One particularly convenient choice has just a single deduction rule, namely: 


• Modus ponens: if $\vdash \alpha$ and $\vdash \alpha \to \beta$, then $\vdash \beta$.
Even so, the axioms of propositional logic may be stated in many different ways.
Although it is even possible to use a single logical symbol (namely the Sheffer
stroke $|$, called NAND in computer science, where $\alpha \,|\, \beta$ means $\neg(\alpha \wedge \beta)$), we proceed
less radically and initially use two symbols. To this end, it is easy to show that


$\alpha \wedge \beta \leftrightarrow \neg(\alpha \to \neg\beta)$; (D.46)
$\alpha \vee \beta \leftrightarrow (\neg\alpha \to \beta)$ (D.47)


are tautologies, so that in principle the symbols $\vee$ and $\wedge$ are superfluous, in that $\alpha \wedge \beta$
may be regarded as an abbreviation of $\neg(\alpha \to \neg\beta)$, and likewise, $\alpha \vee \beta$ stands for
$\neg\alpha \to \beta$. A possible choice for the axioms that regulate $\neg$ and $\to$ is:


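$\vdash \beta \to (\alpha \to \beta)$; (D.48)
$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta))$; (D.49)
$\vdash (\neg\alpha \to \neg\beta) \to ((\neg\alpha \to \beta) \to \alpha)$. (D.50)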


The third axiom settles the use of $\neg$ and, jointly with modus ponens, justifies
proof by contradiction or reductio ad absurdum: suppose one has established


$\vdash \neg\alpha \to \beta$; (D.51)
$\vdash \neg\alpha \to \neg\beta$, (D.52)


then (D.50), (D.52), and modus ponens yield $\vdash (\neg\alpha \to \beta) \to \alpha$. Modus ponens
with (D.51) then yields $\vdash \alpha$. Furthermore, as another proof technique (i.e. a theorem about
propositional calculus) one can prove the deduction theorem:


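Theorem D.9. If $\alpha$ and $(\gamma_1,\ldots,\gamma_n)$ imply $\beta$, then $(\gamma_1,\ldots,\gamma_n)$ imply $\alpha \to \beta$.

Introducing an external implication symbol $\Rightarrow$, such statements are often written:

$(\alpha, \gamma_1,\ldots,\gamma_n) \vdash \beta \ \Rightarrow\ (\gamma_1,\ldots,\gamma_n) \vdash \alpha \to \beta$. (D.53)

Writing the external “and” as a comma, one can similarly prove the rules


$\beta \to \gamma,\ \gamma \to \delta \ \Rightarrow\ \beta \to \delta$; (D.54)
$\beta \to (\gamma \to \delta),\ \gamma \ \Rightarrow\ \beta \to \delta$. (D.55)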


As already mentioned, the central result about propositional logic is: 


Theorem D.10. For any proposition $\varphi$, one has $\vdash \varphi$ iff $\models \varphi$.


Proof. We only prove the easy direction. Axioms are tautologies, and modus ponens
preserves truth, in that $\models \alpha$ and $\models \alpha \to \beta$ imply $\models \beta$, as follows from the fourth row
of the truth table for $\alpha \to \beta$. Hence each step in a proof preserves tautologies.


Nonetheless, the notions of theorem and tautology are quite different conceptually: 
the first is defined syntactically, whereas the latter is defined semantically. 
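
The purely syntactic character of theoremhood can be illustrated by a small proof checker. The sketch below is written under stated assumptions: the three axiom schemes are taken in the standard form B -> (A -> B), (B -> (C -> D)) -> ((B -> C) -> (B -> D)) and (not A -> not B) -> ((not A -> B) -> A), propositions are encoded as nested tuples, and the names match, is_axiom and check are illustrative. The checker never evaluates truth values; it only tests whether each line is an instance of an axiom scheme or follows from earlier lines by modus ponens.

```python
def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

# Axiom schemes; the upper-case strings are metavariables for propositions.
AXIOMS = [
    imp("B", imp("A", "B")),
    imp(imp("B", imp("C", "D")), imp(imp("B", "C"), imp("B", "D"))),
    imp(imp(neg("A"), neg("B")), imp(imp(neg("A"), "B"), "A")),
]

def match(schema, phi, env):
    """Try to instantiate the metavariables in schema so that it equals phi."""
    if isinstance(schema, str):                      # a metavariable
        return env.setdefault(schema, phi) == phi
    return (isinstance(phi, tuple) and len(phi) == len(schema)
            and phi[0] == schema[0]
            and all(match(s, p, env) for s, p in zip(schema[1:], phi[1:])))

def is_axiom(phi):
    return any(match(ax, phi, {}) for ax in AXIOMS)

def check(proof):
    """Each line must be an axiom instance or follow by modus ponens
    from two earlier lines a and a -> phi."""
    for n, phi in enumerate(proof):
        earlier = proof[:n]
        by_modus_ponens = any(imp(a, phi) in earlier for a in earlier)
        if not (is_axiom(phi) or by_modus_ponens):
            return False
    return True

# A five-line derivation of p -> p.
p = "p"
pp = imp(p, p)
PROOF = [
    imp(imp(p, imp(pp, p)), imp(imp(p, pp), pp)),   # second scheme
    imp(p, imp(pp, p)),                             # first scheme
    imp(imp(p, pp), pp),                            # modus ponens
    imp(p, pp),                                     # first scheme
    pp,                                             # modus ponens
]
```

Calling check(PROOF) accepts the derivation, whereas check([pp]) rejects the bare claim p -> p, since that proposition is a tautology but not an axiom instance.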
At the other end of the spectrum, we mention an axiom system that involves all
four logical connectives (whilst keeping modus ponens as the only deduction rule): 


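$\vdash (\beta \wedge \gamma) \to \beta$; (D.56)
$\vdash (\beta \wedge \gamma) \to \gamma$; (D.57)
$\vdash \beta \to (\gamma \to (\beta \wedge \gamma))$; (D.58)
$\vdash \beta \to (\beta \vee \gamma)$; (D.59)
$\vdash \gamma \to (\beta \vee \gamma)$; (D.60)
$\vdash (\beta \to \delta) \to ((\gamma \to \delta) \to ((\beta \vee \gamma) \to \delta))$; (D.61)
$\vdash \beta \to (\gamma \to \beta)$; (D.62)
$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta))$; (D.63)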
$\vdash \neg\beta \to (\beta \to \gamma)$; (D.64)
$\vdash (\beta \to \gamma) \to ((\beta \to \neg\gamma) \to \neg\beta)$; (D.65)
$\vdash \neg\neg\beta \to \beta$. (D.66)


We now describe the relationship between propositional logic and Boolean algebras.
Define an equivalence relation $\sim$ on the set $B_\Sigma$ of propositions by


$\varphi \sim \psi$ iff $\psi \vdash \varphi$ and $\varphi \vdash \psi$, (D.67)


where, as in (D.53), the notation $\psi \vdash \varphi$ means that $\varphi$ can be derived from $\psi$, which
is the case iff $\vdash \psi \to \varphi$. The ensuing set of equivalence classes


$L_\Sigma = B_\Sigma/\sim$ (D.68)


is called the (classical) Lindenbaum (-Tarski) algebra for the given signature $\Sigma$.


Theorem D.11. The set $L_\Sigma$ defined by (D.68) is partially ordered by


$[\psi] \leq [\varphi]$ iff $\psi \vdash \varphi$. (D.69)


In this ordering, the ensuing poset is a Boolean algebra, with operations


$[\psi] \vee [\varphi] = [\psi \vee \varphi]$; (D.70)
$[\psi] \wedge [\varphi] = [\psi \wedge \varphi]$; (D.71)
$\neg[\psi] = [\neg\psi]$. (D.72)


Furthermore, the bottom and top elements of $L_\Sigma$ are the equivalence classes of
any contradiction and any tautology, respectively. The Boolean algebra $L_\Sigma$ thus
obtained is the free Boolean algebra $\mathcal{B}_\Sigma$ on the set $\Sigma$, and hence any valuation
(D.44) - (D.43) induces a homomorphism of Boolean algebras


$V : \mathcal{B}_\Sigma \to \{0,1\}$. (D.73)
Here the free Boolean algebra $\mathcal{B}_\Sigma$ on a set $\Sigma$ is defined as usual, namely as “the”
Boolean algebra (unique up to isomorphism), along with an injection $\iota : \Sigma \to \mathcal{B}_\Sigma$,
such that any map $g : \Sigma \to A$, where $A$ is some Boolean algebra, factors through $\iota$
(i.e., there is a unique homomorphism $f : \mathcal{B}_\Sigma \to A$ such that $g = f \circ \iota$).
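
For a finite signature the Lindenbaum algebra can be computed explicitly. The sketch below relies on the completeness theorem stated above: derivability of one proposition from another is replaced by the semantic test that the corresponding implication is a tautology, so that an equivalence class is represented by a truth table. The two-letter signature and the names SIGMA, ROWS, cls, leq and generated are assumptions made for this illustration.

```python
from itertools import product

SIGMA = ("p", "q")                                 # a two-letter signature
ROWS = list(product((0, 1), repeat=len(SIGMA)))    # all valuations of SIGMA

def cls(phi):
    """The class [phi], represented by the truth table of phi over ROWS."""
    def val(f, v):
        if isinstance(f, str):
            return v[f]
        if f[0] == "not":
            return 1 - val(f[1], v)
        x, y = val(f[1], v), val(f[2], v)
        return {"and": min(x, y), "or": max(x, y), "imp": max(1 - x, y)}[f[0]]
    return tuple(val(phi, dict(zip(SIGMA, row))) for row in ROWS)

def leq(a, b):
    """[psi] <= [phi]: psi -> phi is a tautology, i.e. a <= b pointwise."""
    return all(x <= y for x, y in zip(a, b))

def generated():
    """Close the classes of the atoms under negation, meet and join."""
    elems = {cls(p) for p in SIGMA}
    while True:
        new = set(elems)
        new |= {tuple(1 - x for x in a) for a in elems}
        new |= {tuple(map(min, a, b)) for a in elems for b in elems}
        new |= {tuple(map(max, a, b)) for a in elems for b in elems}
        if new == elems:
            return elems
        elems = new
```

With two atoms, generated() contains 16 classes, the number of elements of the free Boolean algebra on two generators; the classes of p -> q and of (not p) or q coincide.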

Constructions like this become more interesting for propositional theories, in
which (beyond specifying the signature $\Sigma$) further axioms are added to whatever
system for which Theorem D.10 holds. Let us call the list of such axioms $\mathcal{T}$, where
we assume that the theory is consistent, in that no contradiction can be derived
from $\mathcal{T}$ (in propositional logic, as opposed to predicate logic, this question is
decidable). We also assume that $\mathcal{T}$ contains no tautologies (which would add no
new theorems). We now write $\mathcal{T} \vdash \varphi$ if $\varphi$ can be derived (in a finite number of
steps) from $\mathcal{T}$ and the basic axioms and deduction rule(s). Unless $\mathcal{T}$ is empty, the
set of theorems will be larger now (e.g., any member of $\mathcal{T}$ itself, say $\psi$, is trivially
a theorem of $\mathcal{T}$). In order to preserve Theorem D.10, now in the form


$\mathcal{T} \vdash \varphi$ iff $\mathcal{T} \models \varphi$, (D.74)


we should define the right-hand side appropriately. Call a valuation (D.43), or,
equivalently, the corresponding homomorphism (D.73), a binary model of $\mathcal{T}$ if


$V(\alpha) = 1$, (D.75)


for each $\alpha \in \mathcal{T} \subset B_\Sigma$ (by soundness this is already the case for the axioms of
propositional logic per se). We then say that $\mathcal{T} \models \varphi$ iff $V(\varphi) = 1$ (i.e., $\varphi$ is true) in
any binary model of $\mathcal{T}$. On this definition of $\mathcal{T} \models$, eq. (D.74), and hence Theorem
D.10 (with $\mathcal{T}$ added to the axioms), holds. Moreover, for $\alpha, \beta \in B_\Sigma$, define


$\alpha \sim_{\mathcal{T}} \beta$ iff $\mathcal{T} \vdash (\alpha \leftrightarrow \beta)$, (D.76)

where the right-hand side stands for $(\mathcal{T}, \alpha) \vdash \beta$ and $(\mathcal{T}, \beta) \vdash \alpha$. Then define


$L_{(\Sigma,\mathcal{T})} = B_\Sigma/\sim_{\mathcal{T}}$, (D.77)


and (partially) order $L_{(\Sigma,\mathcal{T})}$ by $[\psi] \leq [\varphi]$ iff $(\mathcal{T}, \psi) \vdash \varphi$; as before, this is equivalent
with $\mathcal{T} \vdash (\psi \to \varphi)$. This construction obviously generalizes (D.67), etc. Then Theorem
D.11 holds (mutatis mutandis) for $L_{(\Sigma,\mathcal{T})}$. In particular, $L_{(\Sigma,\mathcal{T})}$ is a Boolean
algebra, which can also be shown to have the following universal property.

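A model of $\mathcal{T}$ in some Boolean algebra $B$ is a map $V : \Sigma \to B$ whose unique
extension $V : B_\Sigma \to B$ makes the axioms of $\mathcal{T}$ true, i.e. $V(\alpha) = \top$ for each $\alpha \in \mathcal{T}$
(where $\top$ is the top element of $B$). Note that $\alpha \mapsto [\alpha]$ is a model of $\mathcal{T}$ in $L_{(\Sigma,\mathcal{T})}$.


Theorem D.12. For each model $V : \Sigma \to B$ of $\mathcal{T}$, there is a unique homomorphism
$V' : L_{(\Sigma,\mathcal{T})} \to B$ of Boolean algebras such that $V(\alpha) = V'([\alpha])$ for each $\alpha \in B_\Sigma$.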
D.3 Intuitionistic propositional logic


In view of its importance for quantum mechanics and topos theory, we now briefly 
discuss the intuitionistic version of the preceding material on (classical) proposi- 
tional logic. Intuitionism in mathematics originated with the Dutch mathematician 
L.E.J. Brouwer (1881-1966), who was also one of the most important early con- 
tributors to the field of (algebraic) topology. Brouwer held a rather subjective view 
of mathematics (sometimes even tending towards solipsism), in which mathematics 
primarily resided within the mind of the “creative subject” (perhaps the right translation
of Brouwer’s “scheppend” is “creating” rather than “creative”). Any means
of communication supposedly weakened this effort, so that Brouwer saw the formal- 
ization of mathematics (including logic) as secondary and even potentially danger- 
ous; he openly (and polemically) opposed his views to the “formalism” he attributed 
to Hilbert, with whom he also fell out personally. A more technical consequence 
of Brouwer’s intuitionism was an emphasis on explicit constructions, rejecting not 
only proofs by contradiction, but even the abstract existence of mathematical objects 
in general (as claimed by the so-called Platonic philosophy of mathematics). 

Brouwer’s lasting influence on logic is partly due to his student Arend Heyting
(1898-1980), who was less radical than his teacher and formalized (!) intuitionistic
logic analogously to its classical counterpart. In fact, the system (D.56) - (D.65),
with modus ponens, gives axioms for intuitionistic propositional logic, which therefore
differs from classical propositional logic exclusively by the absence of the law
of the excluded middle (D.66). It is customary in intuitionistic logic to use the purely
logical symbols $\wedge$, $\vee$, $\to$ and $\bot$, in terms of which negation is defined by


$\neg\alpha = \alpha \to \bot$. (D.78)

In that case, axiom (D.65) is simply replaced by

$\vdash \bot \to \alpha$, (D.79)


and in the presence of (D.56) - (D.64) with (D.79), the axiom that makes the system
classical may now be formulated as the validity of reductio ad absurdum, i.e.,


$\vdash ((\alpha \to \bot) \to \bot) \to \alpha$, (D.80)


which is therefore denied in intuitionistic logic. Similarly, classical rules like:


$\alpha \vee \neg\alpha$; (D.81)
$\neg\neg\alpha \vee \neg\alpha$; (D.82)
$(\neg\alpha \to \neg\beta) \to (\beta \to \alpha)$; (D.83)
$(\alpha \to \beta) \vee (\beta \to \alpha)$; (D.84)
$\neg(\neg\alpha \wedge \neg\beta) \to (\alpha \vee \beta)$; (D.85)
$\neg(\neg\alpha \vee \neg\beta) \to (\alpha \wedge \beta)$, (D.86)

are invalid in intuitionistic logic, as is, of course, (D.66). Fortunately, as theorems
of intuitionistic propositional logic one does have:


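$\vdash \alpha \to \neg\neg\alpha$; (D.87)
$\vdash \neg\neg\neg\alpha \to \neg\alpha$; (D.88)
$\vdash (\alpha \to \beta) \to (\neg\beta \to \neg\alpha)$; (D.89)
$\vdash \neg\alpha \vee \neg\beta \to \neg(\alpha \wedge \beta)$; (D.90)
$\vdash \neg(\alpha \vee \beta) \to (\neg\alpha \wedge \neg\beta)$; (D.91)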
$\vdash (\alpha \to \neg\beta) \to (\neg\neg\beta \to \neg\alpha)$. (D.92)


More generally, Gödel’s negative translation of classical (propositional) logic into
intuitionistic (propositional) logic establishes the fact that if, in a classically valid
proposition, one puts $\neg\neg$ in front of atomic propositions and recursively replaces
$\alpha \vee \beta$ by $\neg(\neg\alpha \wedge \neg\beta)$, which changes nothing classically, the ensuing proposition
is intuitionistically valid. In this sense, intuitionistic logic is stronger than classical
logic, although at first sight it looks weaker (as it has fewer axioms). Also more
generally, one often sees that classical results whose proofs apparently rely on
intuitionistically invalid reasoning are classically equivalent to intuitionistically
valid results (e.g. Gelfand duality).
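
The translation just described is a simple recursion on the structure of a proposition. The sketch below implements only the two rewriting steps named in the text (double negation in front of atoms, and replacement of disjunctions); it assumes a tuple encoding of propositions with tags "not", "and", "or" and "imp", and the names neg and translate are illustrative. It does not by itself verify intuitionistic validity.

```python
def neg(a):
    return ("not", a)

def translate(phi):
    """Negative translation: atoms p become not not p, and a disjunction
    a or b becomes not(not a and not b); other connectives are kept."""
    if isinstance(phi, str):                 # atomic proposition
        return neg(neg(phi))
    op = phi[0]
    if op == "not":
        return neg(translate(phi[1]))
    a, b = translate(phi[1]), translate(phi[2])
    if op == "or":
        return neg(("and", neg(a), neg(b)))
    return (op, a, b)                        # "and" and "imp" are unchanged
```

For example, the excluded middle p or not p is sent to not(not not not p and not not not not p), which contains no disjunction any more.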

A natural (and complete) semantics for intuitionistic propositional logic is given
by Heyting algebras (replacing the Boolean algebras of the classical case). Let $I_\Sigma$
denote the set of all propositions (i.e., well-formed formulae) on some signature $\Sigma$
built from the letters $p \in \Sigma$ and the symbols $\wedge$, $\vee$, $\to$ and $\bot$, where in formation rule
i) preceding (D.42) we also declare $\bot$ to be a proposition, and we omit $\neg\alpha$ at the
end of rule ii), as it is a special case of the preceding part with (D.78). If $H$ is a
Heyting algebra, we may then extend any function $V : \Sigma \to H$ to a function


$V : I_\Sigma \to H$ (D.93)

by recursively using the following rules, where $\bullet$ is $\wedge$, $\vee$, or $\to$ in $I_\Sigma$, and $\to$ is
read as the Heyting implication $\dashrightarrow$ in $H$:


$V(\bot) = \bot$; (D.94)
$V(\alpha \bullet \beta) = V(\alpha) \bullet V(\beta)$. (D.95)


Then each axiom $\varphi$ of intuitionistic propositional logic is valid, in that

$V(\varphi) = \top$. (D.96)

Moreover, if $\Gamma$ is some finite set of propositions, then

$\Gamma \vdash \varphi$ implies $V(\bigwedge \Gamma) \leq V(\varphi)$. (D.97)

In particular, suppose we have a theory $\mathcal{T}$. As in the classical case, we call a valuation
(D.93) a model of $\mathcal{T}$ if (D.96) holds for each $\varphi \in \mathcal{T}$. It then follows from (D.97)
that each model $V$ of $\mathcal{T}$ is sound in that for all propositions $\varphi$ one has the rule:

$\mathcal{T} \vdash \varphi$ implies $V(\varphi) = \top$. (D.98)


That is, $\varphi$ is true in the given model. As in Theorem D.10, soundness and completeness
of Heyting algebra semantics of intuitionistic propositional logic are then
jointly expressed by the following result (where $\vdash$ denotes derivability using only
the intuitionistically valid axioms (D.56) - (D.64) with (D.79), and modus ponens):


Theorem D.13. For any theory $\mathcal{T}$ in intuitionistic propositional logic, $\mathcal{T} \vdash \varphi$ holds
iff $\mathcal{T} \models \varphi$, i.e., $V(\varphi) = \top$ for all Heyting algebra models $V : I_\Sigma \to H$.


The classical construction of the Lindenbaum algebra may also be copied by defining
$L_\Sigma$ and $L_{(\Sigma,\mathcal{T})}$ as in (D.67) - (D.68), where this time the symbol $\vdash$ defining $\sim$
through (D.67) or (D.76) is the one using the intuitionistically valid axioms only.
It follows that any Heyting algebra model $V : I_\Sigma \to H$ factors through a homomorphism
$L_\Sigma \to H$ of Heyting algebras, just as in the classical case (cf. Theorem D.11).


Kripke models are special Heyting algebra models, which already form a complete
semantics for intuitionistic propositional logic. For any poset $X$, the set


$\mathrm{Upper}(X) = \mathcal{O}(X)$ (D.99)


of all upper subsets $U$ of $X$ (i.e. $y \geq x \in U$ implies $y \in U$), which by definition
coincides with the set $\mathcal{O}(X)$ of open sets in the Alexandrov topology on $X$, is a
Heyting algebra in the partial order defined by inclusion, with $\vee = \cup$, $\wedge = \cap$, and


$U \dashrightarrow V = \{x \in X \mid (\uparrow x) \cap U \subseteq V\}$. (D.100)


Given a valuation $V : \Sigma \to \mathrm{Upper}(X)$ with associated Heyting algebra homomorphism
$V : I_\Sigma \to \mathrm{Upper}(X)$, for any $x \in X$ and $\varphi \in I_\Sigma$ we write $x \Vdash \varphi$ iff $x \in V(\varphi)$,
and say that $x$ forces $\varphi$. Then $V(\varphi) = \top$ iff $x \Vdash \varphi$ for all $x \in X$, and we have:


$x \Vdash \varphi$ and $y \geq x$ imply $y \Vdash \varphi$; (D.101)
$x \Vdash \bot$ for no $x \in X$; (D.102)
$x \Vdash \varphi \wedge \psi$ iff $x \Vdash \varphi$ and $x \Vdash \psi$; (D.103)
$x \Vdash \varphi \vee \psi$ iff $x \Vdash \varphi$ or $x \Vdash \psi$; (D.104)
$x \Vdash \varphi \to \psi$ iff for all $y \geq x$: $y \Vdash \varphi$ implies $y \Vdash \psi$; (D.105)
$x \Vdash \neg\varphi$ iff for all $y \geq x$, $y \Vdash \varphi$ is false. (D.106)


Hence these are properties of any homomorphism $V : I_\Sigma \to \mathrm{Upper}(X)$; originally,
(D.101) - (D.105), which imply (D.106), were taken to be axioms extending a binary
“forcing” relation $x \Vdash p$ on $X \times \Sigma$ to $X \times I_\Sigma$. In topos theory, generalizations of the
rules (D.101) - (D.106), once again theorems rather than axioms, will provide the
Kripke-Joyal semantics of the (intuitionistic) internal logic of toposes (cf. §E.5).
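
The forcing clauses lend themselves to a direct recursive evaluation on a finite poset. The following sketch is an illustration under assumptions: propositions are encoded as nested tuples, the falsum is the reserved string "bot", the valuation V assigns an upper set to each atom, and the names up, forces and valid as well as the two-point poset are chosen here for the example. On this poset the law of the excluded middle and double negation elimination both fail.

```python
def up(x, X, leq):
    """The upper set of x: all y in X with x <= y."""
    return {y for y in X if leq(x, y)}

def forces(x, phi, X, leq, V):
    """Decide whether the point x forces phi; V maps atoms to upper sets."""
    if phi == "bot":
        return False                                   # no point forces falsum
    if isinstance(phi, str):
        return x in V[phi]
    op = phi[0]
    if op == "and":
        return forces(x, phi[1], X, leq, V) and forces(x, phi[2], X, leq, V)
    if op == "or":
        return forces(x, phi[1], X, leq, V) or forces(x, phi[2], X, leq, V)
    if op == "imp":
        return all(not forces(y, phi[1], X, leq, V) or forces(y, phi[2], X, leq, V)
                   for y in up(x, X, leq))
    if op == "not":
        return all(not forces(y, phi[1], X, leq, V) for y in up(x, X, leq))
    raise ValueError(f"unknown connective {op!r}")

def valid(phi, X, leq, V):
    """phi has value top iff every point of X forces it."""
    return all(forces(x, phi, X, leq, V) for x in X)

# Two-point poset 0 <= 1, with the atom p forced only at the later point.
X = [0, 1]

def leq(x, y):
    return x <= y

V = {"p": {1}}
EXCLUDED_MIDDLE = ("or", "p", ("not", "p"))
DOUBLE_NEGATION = ("imp", ("not", ("not", "p")), "p")
```

In this model the point 0 forces neither p nor its negation, so valid(EXCLUDED_MIDDLE, X, leq, V) is False, while the converse implication from p to its double negation is forced everywhere.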
D.4 First-order (predicate) logic


Propositional logic lacks the structure to describe arithmetic (not to speak of set theory),
because it has neither variables (as we shall see, the symbols $p_i$ are not variables
but predicate symbols) nor quantification symbols like ‘there exists’ ($\exists$) and
‘for all’ ($\forall$). This defect is remedied by the formalism of predicate logic, also called
first-order logic, which was essentially introduced by Frege and was adopted by 
Hilbert’s school as a universal language for mathematics (as they knew it), in which 
for example the Zermelo—Fraenkel (ZF) axioms for set theory may be formulated 
as a foundation of mathematics (against competitors like the Principia Mathematica 
system of Russell and Whitehead, and others). A simple mathematical theory that 
can be formalized using classical first-order logic is Peano Arithmetic (PA). 


• The notation of a first-order theory consists of symbols from two groups:


1. The purely logical symbols are the familiar symbols $\neg$, $\wedge$, $\vee$, $\to$ from propositional
logic (or some logically independent subset thereof, such as $\neg$ and $\to$),
supplemented by the equality sign = and the quantification symbols $\forall$ and $\exists$
(the latter is in fact superfluous in the classical system discussed here, since
the combination $\exists_x$ defined below is the same as $\neg\forall_x\neg$).

2. Unlike the ones above, the non-logical symbols (comprising the signature of 
the theory) depend on the field of mathematics to be formalized (such as set 
theory or arithmetic), but the general format is as follows. One has: 

a. Variables $a, b, c, \ldots, x, y, z, x_1, x_2, \ldots$, assumed countably many at most. For
example, in PA these variables may be thought of as denoting natural numbers,
whereas in ZF they will be sets, but of course such interpretations do
not form part of the syntax! This warning also applies to the next items.
In many-sorted theories the variables are sorted, in that there is a set
$\{A, B, \ldots\}$ of sorts, and each variable $x = x_A$ belongs to one of these sorts.

b. Constants, arbitrarily formatted. For example, PA has just one constant, 
called 0, to be interpreted as the number zero. Also ZF has just a single 
(even superfluous) constant $\emptyset$, to be interpreted as the empty set.

c. Function symbols f,g,.... Each such symbol has an arity a(f), which is 
a natural number indicating the number of variables it has (as formalized 
below). Formally, one allows a(f) = 0, in which case f is also a constant. 
PA has three function symbols, viz. $S$, $+$, and $\times$, with arities $a(S) = 1$,
$a(+) = 2$, and $a(\times) = 2$ (these will be interpreted as the successor function
$n \mapsto n + 1$, addition, and multiplication, respectively). Perhaps surprisingly
(especially in the light of category theory), ZF has no function symbols: in 
set theory, functions f : X — Y are defined as special subsets of X x Y. 

d. Predicate symbols $P, \ldots$, coming with an arity $a(P) \in \mathbb{N}$, too. These will
play a role in the construction of formulae, see below (some authors count 
= as a predicate symbol with arity 2, instead of as a purely logical symbol). 
PA has no predicate symbols. ZF has one predicate symbol €, with arity 2. 
• According to rules we are about to state, from these symbols one subsequently
constructs terms, formulae, and sentences (or closed formulae). Sentences are
at least candidates for theorems, in that one may attempt to prove them (and may
either succeed or fail, the latter even in two possible ways: the sentence may be
false, in that its negation can be proved, or it may be undecidable, in that neither it
nor its negation can be proved; it was Hilbert’s outspoken intention to exclude
the last possibility, which however was famously shown to be unavoidable by
Gödel). For example, both $x^2 = 1$ and $\exists_x(x^2 = 1)$ are formulae in PA, but only the
latter is a sentence, which is even a theorem. The rules, then, are as follows.


1. Term formation is done by iterating the steps:
a. Any variable $x_i$ is a term.
b. Any constant is a term.
c. Any function symbol $f$ and any tuple of $k = a(f)$ terms $(t_1,\ldots,t_k)$ jointly
yield a term $f(t_1,\ldots,t_k)$; if $a(f) = 0$ this reduces to the previous case.
In PA, this means that $S(t)$ is a term, and that $t_1 + t_2 \equiv +(t_1,t_2)$ and
$t_1 \times t_2 \equiv \times(t_1,t_2)$ are terms (provided $t$, $t_1$, and $t_2$ are terms). For example, the constant
0 is a term, and hence $S(0)$ is a term, which one calls 1. Similarly, $S^n(0)$ is
a term called $n$ (where e.g. $S^2(0) = S(S(0))$, etc.). From these, we can make
terms $n + m$, or $n \times x_i$, and subsequently $(n + m) \times (n \times x_i)$, etc.
In ZF, the only terms are $\emptyset$ and the variables (as ZF lacks function symbols).
2. Formulae are (once again iteratively) constructed from terms using the equality
sign = and the predicate symbols, according to the following rules:
a. If $t_1$ and $t_2$ are terms, then $t_1 = t_2$ is a formula.
b. Any predicate symbol $P$ and any tuple of $k = a(P)$ terms $(t_1,\ldots,t_{a(P)})$ jointly
yield a formula $P(t_1,\ldots,t_{a(P)})$; if $a(P) = 0$, then $P$ is a formula by itself.
c. As in propositional logic: if $\varphi$ and $\psi$ are formulae, then so are $\neg\varphi$, $\varphi \vee \psi$,
$\varphi \wedge \psi$, and $\varphi \to \psi$. What is new to first-order logic is that also $\exists_x \varphi$ and
$\forall_x \varphi$ are formulae, for any variable $x$ (which may or may not occur in $\varphi$).


In PA, the expression $t_1 = t_2$ is a formula (provided $t_1$ and $t_2$ are terms).
In ZF, the expressions $t_1 \in t_2$ and $t_1 = t_2$ are formulae (if $t_1$ and $t_2$ are terms).
3. A variable $x$ occurring in a formula $\varphi$ is called bound if it only occurs via
$\forall_x \psi(x)$ and/or $\exists_x \psi(x)$, where $\psi$ is a subformula of $\varphi$; otherwise, $x$ is free. A
formula containing at least one free variable is called open; if $x$ occurs freely
in an open formula $\varphi$, the latter is sometimes called $\varphi(x)$, and analogously
$\varphi(x_1,\ldots,x_n)$. If all variables in a formula are bound (or if it contains no variables
at all), then it is said to be closed. A sentence is a closed formula.
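
The distinction between open formulae and sentences can be made computable. The sketch below assumes an ad hoc encoding: a variable is a string, a term built from a function symbol is a tuple (name, argument terms), so that a constant is a one-element tuple, and formulae are tuples tagged "eq", "pred", "not", "and", "or", "imp", "forall" or "exists". The names term_vars, free_vars and is_sentence are illustrative.

```python
def term_vars(t):
    """Variables of a term: a variable is a string, whereas a term
    f(t1,...,tk) is a tuple (f, t1, ..., tk); constants have k = 0."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(s) for s in t[1:]))

def free_vars(phi):
    """The set of variables that occur freely in the formula phi."""
    op = phi[0]
    if op == "eq":                           # ("eq", t1, t2)
        return term_vars(phi[1]) | term_vars(phi[2])
    if op == "pred":                         # ("pred", name, t1, ..., tk)
        return set().union(*(term_vars(t) for t in phi[2:]))
    if op == "not":
        return free_vars(phi[1])
    if op in ("and", "or", "imp"):
        return free_vars(phi[1]) | free_vars(phi[2])
    if op in ("forall", "exists"):           # ("forall", x, body)
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown formula tag {op!r}")

def is_sentence(phi):
    """A sentence is a closed formula, i.e. one without free variables."""
    return not free_vars(phi)

ONE = ("S", ("0",))                                  # the term S(0)
OPEN = ("eq", ("times", "x", "x"), ONE)              # x * x = 1
CLOSED = ("exists", "x", OPEN)                       # exists x: x * x = 1
```

Here free_vars(OPEN) is the set containing only x, so OPEN is an open formula, whereas is_sentence(CLOSED) holds.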


• Axioms are, syntactically speaking, special cases of formulae. As for propositional
logic, we may either keep $\wedge$ and $\vee$, and add (D.46) - (D.47)
as axioms, or, equivalently, we may see $\alpha \wedge \beta$ and $\alpha \vee \beta$ as abbreviations for
$\neg(\alpha \to \neg\beta)$ and $\neg\alpha \to \beta$, respectively. Similarly, we may see $\exists$ as a derived
symbol, in that $\exists_x \varphi$ is an abbreviation for $\neg\forall_x \neg\varphi$.
As in propositional logic, the axioms for predicate logic come in two groups: 
purely logical axioms and domain specific axioms. We will state the latter for the 
theories PA and ZF in §D.5 below, and now discuss the former (common to both). 
From propositional logic, we adopt (D.48) - (D.50), where $\alpha$, $\beta$, $\gamma$, $\delta$ are arbitrary
formulae. These are also Axioms 1-3 of predicate logic, to which one adds:


Axiom 4: $\vdash (\forall_x \varphi(x)) \to \varphi(t)$ for any term $t$ (unless $x$ occurs freely in $\varphi$ through
a subformula $\forall_y \psi$ where $y$ occurs in $t$); some authors write $\varphi(x/t)$ for $\varphi(t)$.

Axiom 5: $\vdash (\forall_x(\varphi \to \psi)) \to (\forall_x \varphi \to \forall_x \psi)$.

Axiom 6: $\vdash \forall_x (x = x)$.

Axiom 7: $\vdash \forall_{x,y}((x = y) \to (\varphi(x) \to \varphi(y)))$ for each formula $\varphi$ that contains the
variable $x$ freely and contains $y$ either freely or not at all.


• The only two deduction rules of predicate logic (for formulae $\varphi$, $\psi$) are:


1. Modus ponens: $\vdash (\varphi \to \psi)$ and $\vdash \varphi$ imply $\vdash \psi$.
2. Universal generalization: $\vdash \varphi(x)$ implies $\vdash \forall_x\varphi(x)$.


These rules also apply to theories, with the proviso that in the second, $T \vdash \varphi(x)$ implies
$T \vdash \forall_x\varphi(x)$ only if no formula in $T$ used in the proof of $\varphi$ contains $x$ freely.

• A theorem is a sentence $\varphi$ that can be proved from the axioms using the deduction
rules in a finite number of steps. In that case, we write $\vdash \varphi$.

• A theory $T$ is a set of formulae (assumed contradiction-free, although for e.g. ZF
this cannot be proved within ZF because of Gödel’s Incompleteness Theorem).

• An interpretation of a theory $T$ consists of a nonempty set $M$ (the carrier of
the interpretation), elements $[[c]]_M \in M$ for each constant $c$, functions
$[[f]]_M : M^{a(f)} \to M$ for each function symbol $f$ of arity $a(f)$, and subsets $[[P]]_M \subset M^{a(P)}$
for each predicate symbol $P$ of arity $a(P)$. The interpretation $[[\varphi]]_M$ of a formula
$\varphi$ then follows by giving the logical symbols $\neg$, $\wedge$, $\vee$, $\to$, and $=$ their usual meanings
of “not”, “and”, “or”, “implies”, and “is equal to”, whereas the range of each
variable $x$ occurring in $\forall_x$ or $\exists_x$ is taken to be $M$. If a sentence $\varphi$ is true in this
interpretation, we write $M \models \varphi$. If each axiom of $T$ is true, then we call the given
interpretation a model of $T$ (so that in a model, $T \vdash \varphi$ implies $M \models \varphi$).
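For a finite carrier, the truth of a sentence in an interpretation can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: formulae are encoded as nested tuples, there are no constants or function symbols, and the names `holds`, `REFLEXIVE`, `TOTAL`, and `LEAST` are hypothetical.

```python
def holds(phi, carrier, preds, env=None):
    """Truth value of phi in the interpretation (carrier, preds) under the assignment env."""
    env = env or {}
    tag = phi[0]
    if tag == "pred":      # ("pred", name, var_1, ..., var_n)
        return tuple(env[v] for v in phi[2:]) in preds[phi[1]]
    if tag == "eq":        # ("eq", var_1, var_2)
        return env[phi[1]] == env[phi[2]]
    if tag == "not":
        return not holds(phi[1], carrier, preds, env)
    if tag == "and":
        return holds(phi[1], carrier, preds, env) and holds(phi[2], carrier, preds, env)
    if tag == "or":
        return holds(phi[1], carrier, preds, env) or holds(phi[2], carrier, preds, env)
    if tag == "implies":
        return (not holds(phi[1], carrier, preds, env)) or holds(phi[2], carrier, preds, env)
    if tag == "forall":    # ("forall", var, body): var ranges over the whole carrier
        return all(holds(phi[2], carrier, preds, {**env, phi[1]: m}) for m in carrier)
    if tag == "exists":
        return any(holds(phi[2], carrier, preds, {**env, phi[1]: m}) for m in carrier)
    raise ValueError(f"unknown formula tag: {tag!r}")


# A toy theory with one binary predicate symbol R and two axioms.
REFLEXIVE = ("forall", "x", ("pred", "R", "x", "x"))
TOTAL = ("forall", "x", ("forall", "y",
         ("or", ("pred", "R", "x", "y"), ("pred", "R", "y", "x"))))
theory = [REFLEXIVE, TOTAL]

# Interpretation 1: carrier {0, 1, 2}, R interpreted as the usual order.
M = {0, 1, 2}
leq = {(a, b) for a in M for b in M if a <= b}
is_model_leq = all(holds(ax, M, {"R": leq}) for ax in theory)

# Interpretation 2: carrier {0, 1}, R interpreted as equality (totality fails).
N = {0, 1}
equality = {(a, a) for a in N}
is_model_eq = all(holds(ax, N, {"R": equality}) for ax in theory)

# A further sentence: there is a least element.
LEAST = ("exists", "x", ("forall", "y", ("pred", "R", "x", "y")))
```

Only the first interpretation makes every axiom true, so only it is a model of this toy theory.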


Gödel’s Completeness Theorem (to be contrasted with his incompleteness theorem,
which roughly states that any first-order theory that incorporates PA contains undecidable
sentences) generalizes Theorem D.10 and eq. (D.74) to first-order logic:


Theorem D.14. A first-order theory $T$ is consistent iff it has a model. In that case,
a sentence $\varphi$ of $T$ is a theorem iff it is true in all models of $T$.


Propositional logic is a special case of predicate logic, namely by assuming
no variables, constants, and function symbols, and taking the atomic propositions
$(p_1, \ldots)$ to be predicate symbols with arity zero (or else $\{0, 1\}$-valued variables).
The rules of term formation in predicate logic then show that propositional logic has
no terms, so that step 2.a above is empty, and step 2.b only yields the $p_i$. These may
be turned into compound expressions by the original rules of propositional logic,
which in this case coincide with the rules of predicate logic (since there are no
variables, $\exists_x\varphi$ and $\forall_x\varphi$ are both equivalent to $\varphi$). Finally, formulae coincide with
sentences, since in the absence of variables, all formulae are closed.
As a transition to the next appendix, we continue our discussion on intuitionistic
logic started in §D.3. The propositional fragment of first-order intuitionistic logic is
still given by (D.56) - (D.65), in which the connectives $\wedge$, $\vee$, $\to$, and $\neg$ (or $\bot$) are
independent. The equality sign $=$ is treated with suspicion in intuitionism, and hence
is omitted, whilst $\exists$ can no longer be defined in terms of $\forall$ through the classical
identification of $\exists_x$ with $\neg\forall_x\neg$. Instead, it is regulated by the two axioms


$\vdash (\forall_x(\varphi \to \psi)) \to (\exists_x\varphi \to \exists_x\psi)$; (D.107)
$\vdash \varphi(t) \to \exists_x\varphi(x)$, (D.108)


subject to the same proviso as Axiom 4 of the classical case, plus a deduction rule: 


• $\exists$-elimination: $\vdash \exists_x\varphi$ implies $\vdash \varphi$ (provided $x$ is not free in $\varphi$).


This will be the logic on which the topos theory of the next chapter is based. Scary
examples of intuitionistically invalid rules involving $\forall$ and $\exists$ include:


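$\neg\forall_x\neg\varphi(x) \to \exists_x\varphi(x)$; (D.109)
$\forall_x\neg\neg\varphi(x) \to \neg\neg\forall_x\varphi(x)$; (D.110)
$\neg\forall_x\varphi(x) \to \exists_x\neg\varphi(x)$; (D.111)
$(\varphi \to \exists_x\psi(x)) \to \exists_x(\varphi \to \psi(x))$, (D.112)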


whereas useful intuitionistically valid theorems containing $\forall$ and $\exists$ are, e.g.,


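$\exists_x\neg\varphi(x) \to \neg\forall_x\varphi(x)$; (D.113)
$\neg\neg\forall_x\varphi(x) \to \forall_x\neg\neg\varphi(x)$; (D.114)
$\neg\neg\exists_x\varphi(x) \leftrightarrow \neg\forall_x\neg\varphi(x)$. (D.115)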


Gödel’s negative translation of classical logic to intuitionistic logic extends to first-order
logic: if, further to the manipulations mentioned after (D.92), one also replaces
$\exists_x\varphi(x)$ by $\neg\forall_x\neg\varphi(x)$, then theorems $\varphi$ of classical first-order logic are turned into
theorems of intuitionistic first-order logic. Although we will not use it, we mention
that the notion of a Kripke model also extends from propositional to predicate
intuitionistic logic: compared to a classical model carried by a set $M$, as described
above, we now have a family of (classical!) sets $(M_x)$ indexed by some poset $P$,
in which constants, functions, and predicate symbols are similarly interpreted as
families $[[c]]_{M_x} \in M_x$, $[[f]]_{M_x} : M_x^{a(f)} \to M_x$, and $[[P]]_{M_x} \subset M_x^{a(P)}$, such that if
$x \leq y$, then $M_x \subset M_y$, $[[c]]_{M_x} = [[c]]_{M_y}$, $G([[f]]_{M_x}) \subset G([[f]]_{M_y})$ (where $G(f)$ is the
graph of $f$), and $[[P]]_{M_x} \subset [[P]]_{M_y}$. Further to the forcing rules (D.101) - (D.106) for
intuitionistic propositional logic, there are additional ones for $\exists$ and $\forall$, viz.


$x \Vdash \exists_x\varphi(x)$ if there exists $m \in M_x$ such that $x \Vdash [[\varphi]]_{M_x}(m)$; (D.116)


$x \Vdash \forall_x\varphi(x)$ if for all $y \geq x$ and all $m \in M_y$ one has $y \Vdash [[\varphi]]_{M_y}(m)$. (D.117)


We will revisit these rules in topos theory, see §E.5; indeed, Kripke models for 
intuitionistic predicate logic emerge much more naturally in categorical language. 
D.5 Arithmetic and set theory


Completing our running examples (for classical first-order logic), we now give the 
theories PA and ZF, starting with the axioms of Peano Arithmetic: 


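PA1 $\vdash \forall_x(\neg(S(x) = 0))$;

PA2 $\vdash \forall_x\forall_y(S(x) = S(y) \to x = y)$;

PA3 $\vdash \forall_x(x + 0 = x)$;

PA4 $\vdash \forall_x\forall_y(x + S(y) = S(x + y))$;

PA5 $\vdash \forall_x(x \times 0 = 0)$;

PA6 $\vdash \forall_x\forall_y(x \times S(y) = (x \times y) + x)$;

PA7 $\vdash (\varphi(0) \wedge (\forall_x(\varphi(x) \to \varphi(S(x))))) \to \forall_x\varphi(x)$, for any formula $\varphi(x)$.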


Thinking of the variables in question as natural numbers (which is what Peano him- 
self still did), these axioms obviously capture their properties pretty well and may 
require no further explanation (except perhaps the last one, which enables the proof 
technique of induction). The point, however, is that the axioms only form a syntax; 
the natural numbers $\mathbb{N}$ (as a set) themselves form a model of PA in the general sense
discussed in the previous section (though by no means the only possible model, and
hence $\mathbb{N}$ is called the standard model of PA). In particular, this means that the set
$\mathbb{N}$ is assumed to be known (e.g. via ZF, see below), upon which the interpretation
$[[\varphi]]_{\mathbb{N}}$ of some formula $\varphi$ in PA is determined by the rules given earlier. In particular:


• The constant $0$ is interpreted as the number zero.

• The function (symbol) $S$ is interpreted as $S(x) = x + 1$, whilst the functions $+$
and $\times$ are interpreted as addition and multiplication, respectively.

• The range of all variables is taken to be $\mathbb{N}$, i.e., $\forall_x$ means “for all $x \in \mathbb{N}$”, and $\exists_x$
means “there exists $x \in \mathbb{N}$”.


According to the general definition, a sentence $\varphi$ of PA is then called true in the
given model (i.e., in the natural numbers) if $[[\varphi]]_{\mathbb{N}}$ is true, in which case we write
$\mathbb{N} \models \varphi$. For example, $[[\forall_x\forall_y(x + y = y + x)]]_{\mathbb{N}}$ means that for all natural numbers
$x, y \in \mathbb{N}$, one has $x + y = y + x$ (which is true, isn’t it). Another example is $1 + 1 = 2$, which
abbreviates $S(0) + S(0) = S(S(0))$. The interpretation $[[1 + 1 = 2]]_{\mathbb{N}}$ is given by
$1 + 1 = 2$ (which once again is true!). In particular, the above axioms of PA are true
in this interpretation. The key conceptual point here is that (following Hilbert) one
interprets a theory in a domain that is supposed to be known and consistent, so that
it has its own methods of proof (for otherwise the semantic entailment symbol $\models$
would be undefined). In this particular case, the domain is ZF set theory (or at least
its lower echelons); see the comments to axiom ZF7 below.

It is quite instructive to see the crucial role of the seemingly technical axiom PA7.
Suppose we try to define a model of PA in the set $\mathbb{Q}^+$ of positive rational numbers
(including zero), so that $\forall_x$ means “for all $x \in \mathbb{Q}^+$”, and $\exists_x$ stands for “there exists
$x \in \mathbb{Q}^+$”; the number zero (as the interpretation of the constant $0$) and the functions
$S$, $+$, and $\times$ have their usual meaning, however. Then all of PA1–PA6 hold, but PA7
fails, and hence the given interpretation of PA in $\mathbb{Q}^+$ is not a model of PA.
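One way to see the failure of PA7 concretely is through the instance $\varphi(x) \equiv (x = 0) \vee \exists_y(x = S(y))$, which is our own choice of example rather than one taken from the text. The Python sketch below spot-checks PA1–PA6 on a finite sample of nonnegative rationals (these are checks on samples, not proofs) and then exhibits $x = 1/2$ as a counterexample to the conclusion of this induction instance. The names `S`, `samples`, and `phi` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def S(x):
    """Successor, interpreted as x + 1."""
    return x + 1


# A finite sample of nonnegative rationals; PA1-PA6 are only spot-checked on it.
samples = sorted({Fraction(p, q) for p in range(5) for q in range(1, 4)})
pairs = list(product(samples, repeat=2))

pa1 = all(S(x) != 0 for x in samples)
pa2 = all(x == y for x, y in pairs if S(x) == S(y))
pa3 = all(x + 0 == x for x in samples)
pa4 = all(x + S(y) == S(x + y) for x, y in pairs)
pa5 = all(x * 0 == 0 for x in samples)
pa6 = all(x * S(y) == x * y + x for x, y in pairs)


def phi(x):
    """phi(x): x = 0, or x = S(y) for some nonnegative rational y (i.e. x >= 1)."""
    return x == 0 or x - 1 >= 0


base_case = phi(Fraction(0))
# S(x) is a successor by construction, so the induction step holds for every x >= 0.
step_on_samples = all((not phi(x)) or phi(S(x)) for x in samples)
# The conclusion "for all x, phi(x)" fails: 1/2 is neither zero nor a successor.
conclusion_fails_at_half = not phi(Fraction(1, 2))
```

The hypotheses of this instance of PA7 hold in $\mathbb{Q}^+$ while its conclusion does not, which is exactly what it means for PA7 to fail in this interpretation.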
The axioms of ZF are a trifle more complex than those of PA, but then they are
supposed to describe all of mathematics! We use the following abbreviations:


re (D.118) 
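$\alpha \leftrightarrow \beta \equiv (\alpha \to \beta) \wedge (\beta \to \alpha)$; (D.119)
$x \neq y \equiv \neg(x = y)$; (D.120)
$x \notin y \equiv \neg(x \in y)$. (D.121)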


Other notation of ZF will be explained in the text following the axioms, which are: 


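ZF1 $\vdash \forall_{x,y}((\forall_z(z \in x \leftrightarrow z \in y)) \to x = y)$ (Extensionality)
ZF2 $\vdash \forall_x\exists_y\forall_z(((z \in x) \wedge \varphi(z)) \leftrightarrow z \in y)$ (Separation)
ZF3 $\vdash \neg\exists_x\, x \in \emptyset$ (Empty set)
ZF4 $\vdash \forall_{v,w}\exists_y\forall_z(z \in y \leftrightarrow (z = v) \vee (z = w))$ (Pairing)
ZF5 $\vdash \forall_x\exists_y\forall_z(z \in y \leftrightarrow \exists_{w \in x}\, z \in w)$ (Union)
ZF6 $\vdash \forall_x\exists_y\forall_z(z \in y \leftrightarrow z \subset x)$ (Power set)
ZF7 $\vdash \exists_x(\emptyset \in x \wedge \forall_y(y \in x \to y^+ \in x))$ (Infinity)
ZF8 $\vdash \forall_u((\forall_{x \in u}\exists!_z\varphi(x, z)) \to \exists_y\forall_z(z \in y \leftrightarrow \exists_{x \in u}\varphi(x, z)))$ (Replacement)
ZF9 $\vdash \forall_{v \neq \emptyset}\exists_{x \in v}\forall_y(y \in x \to y \notin v)$ (Regularity)
AC $\vdash \forall_u\exists_w((w \subset \mathcal{P}(u) \times u) \wedge \forall_{x \in \mathcal{P}(u)}(x \neq \emptyset \to \exists!_{y \in x}\langle x, y\rangle \in w))$ (Choice)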


In ZF2 and ZF8, $\varphi(\cdot)$ is an arbitrary formula with at least the specified free variables,
so that these axioms are more properly thought of as axiom schemes.

These axioms have been the subject of entire monographs, but we will be brief 
here. All intuition about the axioms comes from “naive” sets, although the whole 
point should be that the axioms stand on their own, and circumvent the problem 
of defining sets conceptually (as Frege and Cantor desperately tried to do, much 
as Euclid tried to define in vain what a point is, before he was liberated by
Hilbert). The axioms may be put into two groups: Axioms ZF1, ZF3, ZF9, and
AC are concerned with given sets, whereas nos. ZF2, ZF4, ZF5, ZF6, ZF7, and
ZF8 regulate the way new sets may be constructed from old ones. Here are some
comments on the axioms one by one (which should, however, be seen as a whole). 


ZF1 states that a set is determined by its members (which themselves are sets!). 


ZF2 is a correct version of the naive idea of Cantor, Dedekind, and Frege that
every property (or predicate) defines a set. If we look at a predicate as a formula
$\varphi(z)$ stating that $z$ has a certain property, the naive idea of these gentlemen was
that $y = \{z \mid \varphi(z)\}$ is a set. This idea would be secured by the axiom


$\exists_y\forall_z(\varphi(z) \leftrightarrow z \in y)$, (D.122)


which however leads to Russell’s Paradox (in which $\varphi(z) = z \notin z$).
The crucial difference between ZF2 and this naive version is that in ZF one restricts
set formation to those $z$ that satisfy $\varphi(z)$ and are a member of some set $x$
that is already given. By ZF1, the set $y$ defined by ZF2 is unique; it is written as


$y = \{z \in x \mid \varphi(z)\}$. (D.123)


This notation introduces the familiar brackets $\{\cdots\}$ from naive set theory, which
are therefore derived concepts not belonging to the notation of ZF. This is also
true for most of the other symbols from naive set theory (except $\in$, which is a
predicate symbol in ZF). For example, for arbitrary “sets” $x$ and $v$ (which so far
are really just variables in ZF), we introduce $x \cap v$ as a name (i.e., an abbreviation)
for the set $y$ defined by taking $\varphi(z)$ in ZF2 to be $z \in v$. Using the notation (D.123),
this defines the symbol $\cap$ (for “intersection”) by


$x \cap v = \{z \in x \mid z \in v\}$. (D.124)


ZF3 states that $\emptyset$, which was the only constant in ZF, has no elements. According
to ZF1 this set is unique, so that $\emptyset$ may be thought of as the empty set. In particular,
ZF3 implies that there are sets in the first place (instead of defining it as a
constant, one could alternatively introduce the symbol $\emptyset$ at this stage).

An equivalent form of ZF3 is: $\vdash \forall_x\neg(x \in \emptyset)$, also written as $\forall_x\, x \notin \emptyset$.


ZF4 states that for given sets $v$ and $w$, there exists a set $y$ with exactly those two
members. We write this $y$ as $y = \{v, w\}$, which uses brackets $\{\cdots\}$ consistently:
in ZF2 take $\varphi(z)$ to be $(z = v) \vee (z = w)$ and take $x$ to be the $y$ just considered.
This may be iterated, so that we may write $\{x_1,\ldots,x_n\}$ for the set $y$ that satisfies


$\forall_{x_1,\ldots,x_n}\exists_y\forall_z(z \in y \leftrightarrow (z = x_1) \vee \cdots \vee (z = x_n))$; (D.125)

this set is unique by ZF1. Using the notation from ZF2 we may then write

$\{x_1,\ldots,x_n\} = \{z \in y \mid (z = x_1) \vee \cdots \vee (z = x_n)\}$. (D.126)


ZF5 postulates the existence of a set $y$ whose elements are the elements of the
elements of $x$. In this axiom, the generic notation


$\exists_{w \in x}\psi \equiv \exists_w((w \in x) \wedge \psi)$, (D.127)


is used, where $\psi$ is some formula, which in ZF5 is $z \in w$. We write $y = \bigcup x$, which
defines the symbol $\bigcup$, i.e.,


$\bigcup x = \{z \in y \mid \exists_{w \in x}\, z \in w\}$, (D.128)


where $y = \bigcup x$ is the set whose existence is guaranteed by ZF5. In the special
case $x = \{x_1,\ldots,x_n\}$, we write


$x_1 \cup \cdots \cup x_n = \bigcup\{x_1,\ldots,x_n\}$. (D.129)
ZF6 calls for each $x$ to have a power set $y$. The notation


$z \subset x \equiv \forall_y(y \in z \to y \in x)$, (D.130)


defines the symbol $\subset$; note that $z = x$ is allowed, so that $\subset\, = \,\subseteq$. As usual, the set
$y$ is unique due to ZF1, and is denoted by $\mathcal{P}(x)$, whose elements are therefore
the subsets $z$ of $x$. We may write this à la (D.123) as ($y$ being the set from ZF6):


$\mathcal{P}(x) = \{z \in y \mid z \subset x\}$. (D.131)


ZF7 postulates the existence of a set $y$ whose elements include


$\emptyset$, $\emptyset^+ = \{\emptyset\}$, $\{\emptyset\}^+ = \{\emptyset, \{\emptyset\}\}$, $\{\emptyset, \{\emptyset\}\}^+ = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, $\ldots$, (D.132)


in which the notation


$y^+ = \bigcup\{y, \{y\}\} = y \cup \{y\}$, (D.133)


is underwritten by ZF5. Hence the elements of $y^+$ are the elements of $y$, supplemented
with the single element $y$. Following von Neumann, the sets in (D.132)
are called $0, 1, 2, 3, \ldots$, respectively, where $0$ is identified with the empty set, and
$n > 0$ is realized in a very specific way. Thus ZF7 states the existence of a set containing
$0, 1, 2, 3, \ldots$. The intersection of all sets with this property is the smallest
set containing $0, 1, 2, 3, \ldots$; this is the smallest infinite set, called $\omega$. In the standard
model of ZF (see below), $\omega$ is (a copy of) the set $\mathbb{N}$ of natural numbers.
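The von Neumann numerals in (D.132) can be imitated with finite sets in Python. This is merely an illustrative sketch: Python's `frozenset` stands in for (hereditarily finite) sets, and the names `EMPTY`, `successor`, and `numeral` are hypothetical.

```python
EMPTY = frozenset()


def successor(y):
    """y^+ = y ∪ {y}, cf. (D.133)."""
    return y | frozenset({y})


def numeral(n):
    """The n-fold successor of the empty set, i.e. the von Neumann numeral n."""
    s = EMPTY
    for _ in range(n):
        s = successor(s)
    return s


two = numeral(2)    # {∅, {∅}}
three = numeral(3)  # {∅, {∅}, {∅, {∅}}}
```

In this realization the numeral $n$ has exactly $n$ elements, namely the numerals $0, \ldots, n-1$, so that $m < n$ corresponds to membership of the numeral $m$ in the numeral $n$.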


ZF8, in which $\varphi$ should not contain $y$, states that if some formula $\varphi(x, z)$ assigns
exactly one $z$ to any given $x$, then these $z$ form a set, provided the variables $x$ form
a set (i.e., $u$). Such a formula $\varphi$ is really a function $f$ so that $f(x) = z$, and hence
this axiom states that the image of any set under some function is again a set.
Using the notation (D.123), we then have


$f(u) = \{z \in y \mid \exists_{x \in u}\varphi(x, z)\}$. (D.134)


ZF9 is the most contrived axiom in ZF, stating that every nonempty set $v$ contains
some element $x$ disjoint from $v$. Its formulation uses the generic abbreviation


$\forall_{v \neq \emptyset}\psi \equiv \forall_v((\exists_z\, z \in v) \to \psi)$. (D.135)


Using the symbol $\cap$ from (D.124), one easily checks that $\forall_y(y \in x \to y \notin v)$ is
the same as $x \cap v = \emptyset$, in terms of which ZF9 reads


$\vdash \forall_{v \neq \emptyset}\exists_{x \in v}(x \cap v = \emptyset)$. (D.136)


This implies $x \notin x$, which avoids all kinds of paradoxes (though not Russell’s,
which was taken care of by ZF2). Moreover, ZF9 enables transfinite induction.


AC warrants the choice of an element of each nonempty subset of any set. Indeed,
rewriting the expression $\exists!_{y \in x}\langle x, y\rangle \in w$ as $\exists!_{y \in u}(\langle x, y\rangle \in w \wedge y \in x)$, AC reads
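$\vdash \forall_u\exists_w((w \subset \mathcal{P}(u) \times u) \wedge (\forall_{x \in \mathcal{P}(u)}(x \neq \emptyset \to \exists!_{y \in u}(\langle x, y\rangle \in w \wedge y \in x))))$. (D.137)

As we shall see shortly, this shows that there exists a function

$f : \mathcal{P}(u) \to u$ (D.138)


that maps $x \in \mathcal{P}(u)$ to $f(x) \in u$, such that $\forall_{x \in \mathcal{P}(u)}(x \neq \emptyset \to f(x) \in x)$. Although
$\exists_f$ is undefined in (first-order) ZF, one may therefore informally rewrite AC as


$\forall_u\exists_{f : \mathcal{P}(u) \to u}\forall_{x \in \mathcal{P}(u)}\, x \neq \emptyset \to f(x) \in x$. (D.139)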


We now formally define functions, which, as already noted, are curiously absent 
in ZF (which lacks function symbols). This relies on the following theorem of ZF: 


$\vdash \forall_{u,v}\forall_{x,y}\,((x \in u) \wedge (y \in v)) \to \{\{x\}, \{x, y\}\} \in \mathcal{P}(\mathcal{P}(u \cup v))$. (D.140)


We now introduce the abbreviation 


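$\langle x, y\rangle = \{\{x\}, \{x, y\}\}$, (D.141)


which by (D.140) is an element of the double power set $\mathcal{P}(\mathcal{P}(u \cup v))$ (assuming
that $x \in u$ and $y \in v$); this notation makes $\langle x, y\rangle$ an ordered pair, as opposed to
$\{x, y\} = \{y, x\}$. The (cartesian) product of two sets $u$ and $v$ is now defined as the set


$u \times v = \{z \in \mathcal{P}(\mathcal{P}(u \cup v)) \mid \exists_{x \in u}\exists_{y \in v}\, z = \langle x, y\rangle\}$, (D.142)


i.e., in ZF2 we substitute $x \rightsquigarrow \mathcal{P}(\mathcal{P}(u \cup v))$ as well as


$\varphi(z) \rightsquigarrow \exists_{x \in u}\exists_{y \in v}\, z = \langle x, y\rangle$, (D.143)


and denote the (unique) set $y$ thus defined by $u \times v$. Informally, one often writes

$u \times v = \{\langle x, y\rangle \mid x \in u, y \in v\}$. (D.144)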


We are now in a position to define functions in ZF set theory: 


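Definition D.15. A function $f : u \to v$ is a subset $G_f \subset u \times v$ for which


$\forall_{x \in u}\exists!_{y \in v}\langle x, y\rangle \in G_f$. (D.145)


Here $\exists!_{y \in v}\psi(y)$ abbreviates


$\exists_y((y \in v) \wedge (\forall_z(\psi(z) \leftrightarrow z = y)))$, (D.146)


(cf. (D.127)), which yields (D.145) upon the substitution $\psi(y) \rightsquigarrow \langle x, y\rangle \in G_f$.
More generally, one has


$\exists!_y\psi(y) \equiv \exists_y\forall_z(\psi(z) \leftrightarrow z = y)$. (D.147)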
Hence in ZF set theory a function $f$ is defined by (or even identified with) its
graph $G_f$, which closes the historical circle: Newton clearly looked at what we now
call functions through their graphs, upon which Euler began to assign some value
$f(x) \in v$ to $x \in u$ (though always through some concrete prescription). The 19th
century brought the abstract idea of a function as a map between sets, which, as we
just saw, ZF set theory replaced by the view that a function is defined by its graph.
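The constructions (D.141), (D.144), and (D.145) can be imitated for finite sets in Python. The sketch below is only an illustration: `frozenset` stands in for sets, and the names `pair`, `cartesian`, and `is_function` are hypothetical.

```python
def pair(x, y):
    """Kuratowski ordered pair <x, y> = {{x}, {x, y}}, cf. (D.141)."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def cartesian(u, v):
    """The product u x v as the set of all pairs <x, y> with x in u and y in v, cf. (D.144)."""
    return frozenset(pair(x, y) for x in u for y in v)


def is_function(G, u, v):
    """Check that G is a subset of u x v containing, for each x in u, exactly one <x, y>."""
    if not G <= cartesian(u, v):
        return False
    return all(sum(pair(x, y) in G for y in v) == 1 for x in u)


u = {0, 1}
v = {"a", "b"}
graph = frozenset({pair(0, "a"), pair(1, "a")})      # the constant function with value "a"
partial = frozenset({pair(0, "a")})                   # 1 has no image
multi = graph | {pair(0, "b")}                        # 0 has two images
```

Here only `graph` satisfies the condition of Definition D.15, whereas `partial` and `multi` are subsets of the product that fail to be graphs of functions.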


Compared to the standard interpretation of PA in the natural numbers, which was
a special case of the general notion of a model described in §D.4, the standard model
of ZF is unusual, in that its carrier is not a set (but a so-called class), called the set-theoretic
universe (or cumulative hierarchy) $\mathbf{V}$, whose construction was first given
by none other than von Neumann, whose name already pervaded this book. We
will not go into the details of this construction except by noting that, much as the
natural numbers may be built from zero by repeated use of the successor function
$S$, the universe $\mathbf{V}$ is constructed from the empty set $\emptyset$ by “repeated” use of:


• The successor operation $V \mapsto V^+ = V \cup \{V\}$, cf. ZF7.
• The union operation $V \mapsto \bigcup V$, cf. ZF5.
• The power set operation $V \mapsto \mathcal{P}(V)$, cf. ZF6.


However, what is really meant here by “repeated” defies imagination (and may drive
one crazy); fortunately, most of mathematics only uses the lower echelons of $\mathbf{V}$.
Furthermore, interpreting the constant $\emptyset$ by the usual empty set (with the same
name), the interpretation $\varepsilon$ of $\in$ in $\mathbf{V}$ needs to be defined. This is done as follows:


1. There exists no set $Z$ such that $Z \,\varepsilon\, \emptyset$.

2. One has $Z \,\varepsilon\, V^+$ iff $Z \,\varepsilon\, V$ or $Z = V$.

3. One has $Z \,\varepsilon\, \bigcup V$ iff there exists $W \,\varepsilon\, V$ with $Z \,\varepsilon\, W$.

4. One has $Z \,\varepsilon\, \mathcal{P}(V)$ iff $Z \subset V$, where we say $Z \subset V$ iff for all $Y \,\varepsilon\, Z$ one has $Y \,\varepsilon\, V$.


Here $V$, $Y$, and $Z$ are sets in $\mathbf{V}$. Applying these rules “iteratively” (see, however, the
above comment on “repeated”), for all sets $X$ and $Y$ in $\mathbf{V}$, it can in principle be established
whether or not $X \,\varepsilon\, Y$, so that the symbol $\varepsilon$ is defined within $\mathbf{V}$. Having access
to the universe $\mathbf{V}$, $\varepsilon$, and the empty set $\emptyset$, one may then define the interpretation
$[[\varphi]]_{\mathbf{V}}$ of some formula $\varphi$ of ZF in $\mathbf{V}$ by the following rules (cf. PA and $\mathbb{N}$):


1. The range of all variables is $\mathbf{V}$, i.e., $\forall_x\varphi(x)$ means that $\varphi(V)$ holds for all $V \in \mathbf{V}$.
2. The constant $\emptyset$ is interpreted as the empty set.
3. The predicate symbol $\in$ is interpreted as the membership relation $\varepsilon$.


A sentence $\varphi$ in ZF is then true, denoted by $\mathbf{V} \models \varphi$, if $[[\varphi]]_{\mathbf{V}}$ is true. For example, all
axioms of ZF are true in this interpretation (which is by no means trivial!).

In particular, in this model we interpret $n$ (see the explication of ZF7 above) as
the $n$-fold iteration of the successor operation to $\emptyset$, i.e., $n = \emptyset^{+\cdots+}$ (with $n$ pluses),
seen as an element of $\mathbf{V}$, and recover the standard model of the natural numbers
(and hence the carrier of the standard interpretation of PA) as $\mathbb{N} = \bigcup_n n$, which is the
intersection of all sets in $\mathbf{V}$ that contain all sets $n$ (for any finite $n$).
Notes


The “modernist” transformation of mathematics led by Hilbert, including its com- 
plete prehistory and aftermath, is delightfully described in Gray (2008). The revo- 
lutionary nature of Hilbert’s views, which started with his influential book Grund- 
lagen der Geometrie from 1899, is nowhere clearer than from his correspondence 
with Frege (cf. Gabriel et al, 1980), who, though one of the fathers of the formal- 
isation of mathematics (specifically through first-order logic), infuriated Hilbert by 
stating that the latter did not bother to define the notions of “point” or “line” because 
Hilbert assumed these to be familiar to his readers. But no, quite to the contrary: 


‘Hier liegt wohl der Cardinalpunkt des Missverständnisses (...) Ich will nichts als bekannt
voraussetzen (...) Wenn ich unter meinen Punkten irgendwelche Systeme von Dingen, z.B.
das System: Liebe, Gesetz, Schornsteinfeger ..., denke und dann meine sämtlichen Axiome
als Beziehungen zwischen diesen Dingen annehme, so gelten meine Sätze, z.B. der
Pythagoras, auch von diesen Dingen.’ (Hilbert to Frege, 29-12-1899).¹


This may be an exaggeration, however. Einstein probably came closer to the truth: 


‘An dieser Stelle nun taucht ein Rätsel auf, das Forscher aller Zeiten so viel beunruhigt hat.
Wie ist es möglich, daß die Mathematik, die doch ein von aller Erfahrung unabhängiges
Produkt des menschlichen Denkens ist, auf die Gegenstände der Wirklichkeit so vortrefflich
paßt? Kann denn die menschliche Vernunft ohne Erfahrung durch bloßes Denken
Eigenschaften der wirklichen Dinge ergründen?


Hierauf ist nach meiner Ansicht kurz zu antworten: Insofern sich die Sätze der Mathematik
auf die Wirklichkeit beziehen, sind sie nicht sicher, und insofern sie sicher sind, beziehen
sie sich nicht auf die Wirklichkeit.’ (Einstein, 1921).²


The great irony is that Hilbert’s call for abstraction, which at first sight decoupled 
mathematics from its origins in physics and other applications, in fact very rapidly 
led to the deepest applications of mathematics to physics so far, such as the use of 
(pseudo) Riemannian geometry in general relativity, and the use of Hilbert (!) spaces 
and operator algebras in quantum mechanics. In the present book, a high point of this 
paradox is the use of Grothendieck toposes (cf. Appendix E) in quantum mechanics 
(see Chapter 12), especially because Grothendieck himself almost made a sport of 
extreme abstraction, partly motivated by internal mathematical needs in algebraic 
geometry, but undoubtedly also by his indignation about the use of (mathematical) 
physics for military purposes (which put him diametrically against von Neumann). 


¹ This is surely the central point of the misunderstanding (...) I do not want to assume anything
as known (...) If I interpret my notions by arbitrary things, for example, by the system: love, law,
chimney sweeper, and subsequently interpret my axioms as relations between these things, then
my theorems, like the one of Pythagoras, hold about these things. (Translation by the author)


² At this point an enigma presents itself, which in all ages has agitated inquiring minds. How
can it be that mathematics, being after all a product of human thought which is independent of 
experience, is so admirably appropriate to the objects of reality? Is human reason, then, without 
experience, merely by taking thought, able to fathom the properties of real things? 

In my opinion the answer to this question is, briefly, this: as far as the propositions of mathe- 
matics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality. 
(Translation: Sonja Bargmann) 
§D.1. Order theory and lattices

For lattice theory in general and Stone’s Theorem see Givant & Halmos (2009), 
Davey & Priestley (2002), and Johnstone (1982). For (D.36) - (D.37) see Theorem 
33 in Chapter 35 of Givant & Halmos (2009). 


§D.2. Propositional logic 

Halmos & Givant (1998) is an elementary exposition of the connection between 
Boolean lattices and logic. Other useful (propositional as well as first-order) logic 
texts include Bell & Machover (1977), Johnstone (1987), Kaye (2007), and Mendel- 
son (2010). 


§D.3. Intuitionistic propositional logic 

Key writings on intuitionism (at least from the Dutch school) include Brouwer 
(1907, 1918, 1975), Heyting (1956) and Troelstra & van Dalen (1988). See also 
Dummett (2000) for a view from abroad. Our treatment of Kripke models for intuitionistic
propositional logic is taken from Goldblatt (1984) and Palmgren (2009).


§D.4. First-order (predicate) logic 

For the history of first-order logic see Grattan-Guinness (2000) and Mancosu, 
Zach, & Badesa (2004), plus innumerable books about Frege, Russell, Hilbert, etc. 
It is regrettable that the close companionship of mathematics and philosophy at the 
time, whose cross-fertilization has given us both the modern foundations of mathe- 
matics on the one hand and analytic philosophy on the other, has not lasted. 


§D.5. Arithmetic and set theory

For PA see e.g. Kaye (1991), which focuses on
non-standard models. The bible of ZF set theory is Jech (2006).


Appendix E 
Category theory and topos theory 


This appendix gives a brief introduction to category theory, moving towards the par- 
ticular categories that are of interest to quantum theory (viz. categories of presheaves 
and sheaves) as quickly as possible (but not more quickly). However, even the basic 
setup of category theory is already relevant for e.g. the conceptually most satisfac- 
tory formulation of Gelfand duality, as described below Theorem C.23 (see also 
Theorem C.45), and likewise of Stone duality, see Theorem D.5. Otherwise, this 
material will only be used in Chapter 12 on quantum logic. We omit most proofs. 
Categories were originally introduced by Eilenberg & Mac Lane (1945) in order 
to define natural transformations, through which they formalized (and explained) 
the intuition that certain isomorphisms in mathematics are “natural” or “canonical”
(like the one between the second dual V** of a finite-dimensional vector space V and 
V itself, as opposed to the isomorphism between V* and V). Natural transformations 
are predicated on categories and functors, i.e. maps between categories, which are 
analogous to continuous functions between topological spaces, and in turn give rise 
to new categories, similarly to functions giving rise to function spaces in functional 
analysis. Initially meant to organize certain fields of mathematics in a systematic 
way (such as algebraic topology and homological algebra), categories soon became 
objects of study in their own right. As such, the basic vocabulary of category theory 
is completed by defining adjoint functors (invented by Kan in 1958) and (co)limits. 
Toposes are categories with enough structure to support the interpretation of first- 
order (and even higher-order) intuitionistic logic, similar to set theory providing 
semantics for classical predicate logic, which in turn generalizes the relationship 
between propositional logic and Boolean algebra, cf. §D. In this respect, the pres- 
ence of a truth object (i.e. subobject classifier) partly explains their potential rel- 
evance to quantum mechanics. However, toposes were introduced in the 1960s by 
Grothendieck from a completely different motivation, namely algebraic geometry, 
and were originally seen by him as generalizations of topological spaces. This as- 
pect plays an equally important role for quantum mechanics, and hence we quote: 


‘A startling aspect of topos theory is that it unifies two seemingly wholly distinct mathe- 
matical subjects: on the one hand, topology and algebraic geometry, and on the other hand, 
logic and set theory.’ (Mac Lane & Moerdijk, 1992, p. 1). 
E.1 Basic definitions


The definition of a category emphasizes the idea that one is at least as interested 
in the maps between objects as in the objects themselves. The only complication 
(which we ignore) is the use of classes; categories are often too big to be sets, and
hence they require an axiomatization of mathematics different from standard ZF set
theory (such as von Neumann–Bernays–Gödel set theory or algebraic set theory).


Definition E.1. A category $C = (C_1, C_0, i, s, t, m)$ consists of:


• A class $C_0$ of objects.

• A class $C_1$ of arrows (also called morphisms).

• Maps $s : C_1 \to C_0$ (the source map), $t : C_1 \to C_0$ (the target map), $i : C_0 \to C_1$
(the identities map), and $m : C_2 \equiv C_1 \times_{C_0} C_1 \to C_1$ (multiplication), where


$C_1 \times_{C_0} C_1 = \{(f, g) \in C_1 \times C_1 \mid s(f) = t(g)\}$, (E.1)

such that, writing $fg = m(f, g)$ and $\mathrm{id}_x = i(x)$,

$s(fg) = s(g)$; (E.2)
$t(fg) = t(f)$; (E.3)
$(fg)h = f(gh)$; (E.4)
$s(\mathrm{id}_x) = t(\mathrm{id}_x) = x$; (E.5)
$f\,\mathrm{id}_{s(f)} = \mathrm{id}_{t(f)}\,f = f$. (E.6)


Note that (E.4) is well defined by virtue of (E.2) - (E.3). We often write $x \xrightarrow{f} y$ or
$f : x \to y$ or, even better in principle but cumbersome in practice (see below), $y \xleftarrow{f} x$,
when $f \in C_1$ satisfies $s(f) = x$ and $t(f) = y$, and interpret $f$ as an arrow from $x$ to
$y$, so that $\mathrm{id}_x$ is an arrow from $x$ to $x$. Composition $f \circ g = fg$ of arrows is defined
whenever $s(f) = t(g)$ (so that on paper the preferred direction of an arrow is from
right to left!). Arrow composition is associative whenever defined, and each $i(x)$ acts
as an identity under this composition operation. The class of all arrows from $x$ to $y$
in a category $C$ is sometimes written as $\mathrm{Hom}_C(x, y)$, or simply as $\mathrm{Hom}(x, y)$, when $C$
is unambiguous. A category is called small if both $C_0$ and $C_1$ are sets (otherwise, a
category is called large), and locally small if for each $x, y \in C_0$ the class $\mathrm{Hom}_C(x, y)$
is a set (although $C_1$ itself may be a proper class). All categories used in this book
are locally small (though not necessarily small). Here are some examples.


• Sets has sets as objects and functions as arrows. Sets is a large category, but it
follows from the ZF axioms for set theory that it is locally small (as promised).

• Any set $X$ with a preorder $\leq$ (and hence any poset) defines a category $X$ where
$X_0 = X$ and $\mathrm{Hom}(x, y)$ contains a unique arrow iff $x \leq y$, being empty otherwise.

• A small category in which each arrow is invertible is called a groupoid; see
§C.16. In particular, any group may be seen as a category with just a single object.
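The second example can be checked mechanically for a small preorder. The Python sketch below takes the divisors of 12 ordered by divisibility (our own choice of example), encodes the unique arrow $x \to y$ as the pair `(x, y)`, and verifies (E.2) - (E.6) by brute force. All names (`leq`, `arrows`, `s`, `t`, `ident`, `m`) are hypothetical and serve only this illustration.

```python
X = [1, 2, 3, 4, 6, 12]


def leq(x, y):
    """Preorder on X: x <= y iff x divides y."""
    return y % x == 0


# The unique arrow x -> y is encoded as the pair (x, y); it exists iff x <= y.
arrows = [(x, y) for x in X for y in X if leq(x, y)]


def s(f):
    """Source map."""
    return f[0]


def t(f):
    """Target map."""
    return f[1]


def ident(x):
    """Identity arrow on x (it exists by reflexivity)."""
    return (x, x)


def m(f, g):
    """Composite fg ('f after g'), defined only when s(f) = t(g)."""
    if s(f) != t(g):
        raise ValueError("arrows are not composable")
    return (s(g), t(f))


composable = [(f, g) for f in arrows for g in arrows if s(f) == t(g)]

closed = all(m(f, g) in arrows for f, g in composable)  # relies on transitivity
e2_e3 = all(s(m(f, g)) == s(g) and t(m(f, g)) == t(f) for f, g in composable)
e4 = all(m(m(f, g), h) == m(f, m(g, h))
         for f, g in composable for h in arrows if s(g) == t(h))
e5 = all(s(ident(x)) == x and t(ident(x)) == x for x in X)
e6 = all(m(f, ident(s(f))) == f and m(ident(t(f)), f) == f for f in arrows)
```

Reflexivity of the preorder supplies the identities and transitivity supplies composition, whereas associativity and the unit laws hold automatically because there is at most one arrow between any two objects.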
Categories come with an intrinsic notion of isomorphism: one calls two objects
$x, y \in C_0$ isomorphic, written $x \cong y$, when there exist arrows $f : x \to y$ and $g : y \to x$
such that $fg = \mathrm{id}_y$ and $gf = \mathrm{id}_x$. For example, two sets are in bijective correspondence
iff they are isomorphic objects in Sets, two topological spaces are homeomorphic
iff they are isomorphic in the category of topological spaces and continuous
maps, and two C*-algebras are isomorphic in the sense of Definition C.2 iff they are
isomorphic in CA, where we define the following useful categories of C*-algebras:


• CA, which has C*-algebras as objects and homomorphisms as arrows.

• CA$_m$, again having C*-algebras as objects, but now with nondegenerate homomorphisms
into the multiplier algebra as arrows (cf. Theorem C.76 etc.).

• CA$_n$, with C*-algebras and nondegenerate homomorphisms (cf. Definition C.42).

• CA$_1$, with unital C*-algebras as objects and unital homomorphisms as arrows.

• CCA, CCA$_m$, CCA$_n$, and CCA$_1$, i.e. the full subcategories of CA, CA$_m$, CA$_n$,
and CA$_1$, respectively, in which the objects are commutative C*-algebras.


Here the notion of a subcategory $C \subset D$ is the obvious one, i.e. $C_0 \subset D_0$, $C_1 \subset D_1$,
and $C$ is a category by itself (in particular, $C$ is closed under the maps $s, t, i, m$). We
say that $C$ is a full subcategory of $D$ if $\mathrm{Hom}_C(x, y) = \mathrm{Hom}_D(x, y)$ for all $x, y \in C_0$.
We now define the “canonical” maps between categories (which, in the spirit of 
the subject, are often more important than the underlying categories themselves!). 


Definition E.2. Let $C$ and $D$ be categories. A covariant functor or simply functor
$F : C \to D$ consists of a pair of maps $F_i : C_i \to D_i$, $i = 0, 1$, such that:


$i_D \circ F_0 = F_1 \circ i_C$; (E.7)
$s_D \circ F_1 = F_0 \circ s_C$; (E.8)
$t_D \circ F_1 = F_0 \circ t_C$; (E.9)
$F_1(fg) = F_1(f)F_1(g)$ $((f, g) \in C_2)$, (E.10)


where $i_D : D_0 \to D_1$ is the inclusion map in $D$, etc.
A contravariant functor $F : C \to D$ is a pair $F_i : C_i \to D_i$, $i = 0, 1$, such that:


$i_D \circ F_0 = F_1 \circ i_C$; (E.11)
$s_D \circ F_1 = F_0 \circ t_C$; (E.12)
$t_D \circ F_1 = F_0 \circ s_C$; (E.13)
$F_1(fg) = F_1(g)F_1(f)$ $((f, g) \in C_2)$. (E.14)

It follows that $F_0$ is determined by $F_1$, since $i$ is injective, but nonetheless it is useful
to keep them apart. The use of contravariant functors may be avoided by introducing
the opposite category $C^{op}$ of $C$, which has the same objects and arrows as $C$, but the
latter going in the opposite direction (i.e. $s_{C^{op}} = t_C$, etc.). For example, if $C = X$ is
a preorder, in the category $X^{op}$ the partial order is reversed. A contravariant functor
$F : C \to D$ is then obviously the same thing as a covariant functor $F : C \to D^{op}$, or,
equivalently, $F : C^{op} \to D$. This is very important for us, because Gelfand duality is
based on contravariant functors and hence on opposite categories; see below.

Definition E.3. A natural transformation between two functors $F : C \to D$ and
$G : C \to D$ (that are either both covariant or both contravariant) is a map $\tau : C_0 \to D_1$,
written $x \mapsto \tau_x$, such that $s_D(\tau_x) = F_0(x)$ and $t_D(\tau_x) = G_0(x)$ (in other words, $\tau$ is
a collection of maps $\tau_x : F_0(x) \to G_0(x)$ indexed by $x \in C_0$), such that the following
diagram commutes for all arrows $f : x \to y$:

$$\begin{array}{ccc}
F_0(x) & \xrightarrow{\ \tau_x\ } & G_0(x) \\
{\scriptstyle F_1(f)}\downarrow & & \downarrow{\scriptstyle G_1(f)} \\
F_0(y) & \xrightarrow{\ \tau_y\ } & G_0(y)
\end{array} \tag{E.15}$$

Two functors $F$ and $G$ as above are called naturally isomorphic, written $F \cong G$,
when there exists a natural transformation $\tau$ between them for which all arrows $\tau_x$
are invertible (i.e., are isomorphisms).


It follows that if $F$ and $G$ are naturally isomorphic, then $F_0(x) \cong G_0(x)$ for all $x \in C_0$,
but this condition is not sufficient by itself to render $F$ and $G$ naturally isomorphic,
for the isomorphisms $\tau_x$ between $F_0(x)$ and $G_0(x)$ must be compatible with the arrows,
as expressed by the diagram in the above definition; this is even the whole point!

Definition E.3 clarifies the idea that the double dual $V^{**}$ of any finite-dimensional
vector space $V$ is isomorphic to $V$ in a “natural” way: namely, the functor $**$ from
the category of finite-dimensional vector spaces (over $\mathbb{C}$) to itself (with linear maps
as arrows) is naturally isomorphic to the identity functor through the natural
transformation whose components $\tau_V : V \to V^{**}$ are given by the “Gelfand transform”
$v \mapsto \hat{v}$, where $\hat{v}(\theta) = \theta(v)$ for $\theta \in V^*$. In contrast, the dual $V^*$ is isomorphic to $V$ in
an “unnatural” way, in that any isomorphism depends on the choice of a basis.
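
To see what naturality amounts to in this example, the following short computation may help. It is only a sketch: the notation $f^*$ and $f^{**}$ for the transpose and double transpose of a linear map is an assumption, not taken from the text, and $\tau_V(v)(\theta) = \theta(v)$ denotes the map from $V$ to its double dual described above. For linear $f : V \to W$, put $f^*(\theta) = \theta \circ f$ for $\theta \in W^*$ and $f^{**}(\varphi) = \varphi \circ f^*$ for $\varphi \in V^{**}$. Then, for $v \in V$ and $\theta \in W^*$:

```latex
\begin{align*}
\bigl(f^{**}(\tau_V(v))\bigr)(\theta)
  &= \tau_V(v)\bigl(f^*(\theta)\bigr)
   = \bigl(f^*(\theta)\bigr)(v) \\
  &= \theta\bigl(f(v)\bigr)
   = \bigl(\tau_W(f(v))\bigr)(\theta).
\end{align*}
```

Hence $f^{**} \circ \tau_V = \tau_W \circ f$, which is the commutativity of the naturality square for the identity functor and the double-dual functor; no basis was chosen anywhere in the argument.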


Definition E.4. Two categories $C, D$ are called equivalent, written $C \simeq D$, when
there exist (covariant) functors $F : C \to D$ and $G : D \to C$ such that $F \circ G \cong \mathrm{id}_D$
and $G \circ F \cong \mathrm{id}_C$. Similarly, $C$ and $D$ are called dual when there exist contravariant
functors with the same properties, i.e., if $C$ and $D^{op}$ are equivalent.


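Here $\mathrm{id}_C$ is the identity functor on $C$, etc. Spelling out what this means, using
Definition E.3, yields the commutative diagrams

$$\begin{array}{ccc}
G_0 \circ F_0(x) & \xrightarrow{\ \tau_x\ } & x \\
{\scriptstyle G_1 \circ F_1(f)}\downarrow & & \downarrow{\scriptstyle f} \\
G_0 \circ F_0(y) & \xrightarrow{\ \tau_y\ } & y
\end{array} \tag{E.16}$$

for all $f : x \to y$ in $C_1$, where each $\tau_x$ is invertible, and also for all $f' : x' \to y'$ in $D_1$,

$$\begin{array}{ccc}
F_0 \circ G_0(x') & \xrightarrow{\ \tau'_{x'}\ } & x' \\
{\scriptstyle F_1 \circ G_1(f')}\downarrow & & \downarrow{\scriptstyle f'} \\
F_0 \circ G_0(y') & \xrightarrow{\ \tau'_{y'}\ } & y'
\end{array} \tag{E.17}$$

We are now in a position to give a categorical (re)formulation of Gelfand duality.
Further to the categories of commutative C*-algebras CCA1, CCAn, and CCAm
defined earlier in this section, this involves the following categories of spaces: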


• CH, i.e. the category of compact Hausdorff spaces and continuous maps.
• LCHp, with locally compact Hausdorff spaces and proper continuous maps.
• LCH, the category of locally compact Hausdorff spaces and continuous maps.


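Theorem E.5. There are categorical equivalences (i.e., dualities if ‘op’ is omitted):

$$\mathrm{CCA1} \simeq \mathrm{CH}^{op}; \tag{E.18}$$
$$\mathrm{CCAn} \simeq \mathrm{LCHp}^{op}; \tag{E.19}$$
$$\mathrm{CCAm} \simeq \mathrm{LCH}^{op}. \tag{E.20}$$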


Proof. In the proof of Theorem C.23, the maps $\mathrm{ev}_X$ provide a natural isomorphism
between the functors $\mathrm{id}_{\mathrm{CH}}$ and $\Sigma \circ C$ from CH to itself, whilst the maps $G_A$ perform
the same job for the functors $\mathrm{id}_{\mathrm{CCA1}}$ and $C \circ \Sigma$ from CCA1 to itself; the naturality
properties (C.40) and (C.41) precisely express commutativity of the above diagrams.
Likewise for the other two cases, which restate Theorems C.45 and C.76.


Similarly, Stone’s Theorem D.5 is best seen categorically, stating that the category
of Boolean lattices (with homomorphisms preserving $\vee$, $\wedge$, and $\neg$ as arrows) is
dual to the category of Stone spaces (as a full subcategory of CH). With hindsight,
Stone’s Theorem (which predated category theory) was the first such duality result.
Definition E.4 may be strengthened by replacing the isomorphisms

$$F \circ G \cong \mathrm{id}_D; \tag{E.21}$$
$$G \circ F \cong \mathrm{id}_C, \tag{E.22}$$

by equalities, i.e.,

$$F \circ G = \mathrm{id}_D; \tag{E.23}$$
$$G \circ F = \mathrm{id}_C. \tag{E.24}$$

In that case, the categories $C$ and $D$ are called isomorphic. However, this is less
relevant than the following weakening of the first two conditions, called an adjunction:


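Definition E.6. Two functors $F : C \to D$ and $G : D \to C$ form an adjoint pair if there
exist natural transformations $\eta$ from $\mathrm{id}_C$ to $G \circ F$ (called the unit of the adjunction),
and $\varepsilon$ from $F \circ G$ to $\mathrm{id}_D$ (called the counit of the adjunction), such that the following
diagrams of natural transformations (i.e. the triangle identities) commute:

$$\begin{array}{ccc}
F & \xrightarrow{\ F\eta\ } & FGF \\
 & {\scriptstyle \mathrm{id}_F}\searrow & \downarrow{\scriptstyle \varepsilon F} \\
 & & F
\end{array}
\qquad\qquad
\begin{array}{ccc}
G & \xrightarrow{\ \eta G\ } & GFG \\
 & {\scriptstyle \mathrm{id}_G}\searrow & \downarrow{\scriptstyle G\varepsilon} \\
 & & G.
\end{array} \tag{E.25}$$

We write $F \dashv G$, and say that $F$ is left-adjoint to $G$, or that $G$ is right-adjoint to $F$.

It is easy to see that if they exist, left or right adjoints are unique up to isomorphism.
If we assume that $C$ and $D$ are locally small (in that all classes $\mathrm{Hom}_C(x,y)$ and
$\mathrm{Hom}_D(x',y')$ are sets), then the above definition states that the functors
$\mathrm{Hom}_D(F(-),-)$ and $\mathrm{Hom}_C(-,G(-))$, both defined from $C^{op} \times D$ to Sets, are
naturally isomorphic. In other words, for each $x \in C_0$ and $y' \in D_0$, we have a bijection

$$\mathrm{Hom}_D(F(x), y') \cong \mathrm{Hom}_C(x, G(y')) \tag{E.26}$$

that is natural in both variables $x$ and $y'$ (i.e., for each $y' \in D_0$, the functors
$\mathrm{Hom}_D(F(-),y')$ and $\mathrm{Hom}_C(-,G(y'))$ from $C^{op}$ to Sets are naturally isomorphic,
and for each $x \in C_0$, the functors $\mathrm{Hom}_D(F(x),-)$ and $\mathrm{Hom}_C(x,G(-))$ from $D$ to
Sets are naturally isomorphic). Indeed, the natural bijection (E.26) is given by

$$\bigl(F(x) \xrightarrow{\ f\ } y'\bigr) \mapsto \bigl(x \xrightarrow{\ \eta_x\ } GF(x) \xrightarrow{\ G(f)\ } G(y')\bigr); \tag{E.27}$$
$$\bigl(x \xrightarrow{\ g\ } G(y')\bigr) \mapsto \bigl(F(x) \xrightarrow{\ F(g)\ } FG(y') \xrightarrow{\ \varepsilon_{y'}\ } y'\bigr). \tag{E.28}$$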


This may even be interesting if $C = D$, and hence $F : C \to C$ and $G : C \to C$. For
example, a Heyting algebra $H$ (seen as a posetal category) is home to an adjunction

$$(- \wedge y) \dashv (y \to -), \tag{E.29}$$

for any fixed $y \in H$, where, writing (E.29) as $F \dashv G$ as usual, we put

$$F_0(x) = x \wedge y; \tag{E.30}$$
$$G_0(x) = (y \to x). \tag{E.31}$$
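
In a posetal category the natural bijection between hom-sets reduces to the statement that $x \wedge y \leq z$ holds iff $x \leq (y \to z)$. The following is a minimal sketch that checks this by brute force, assuming as a hypothetical example the Heyting algebra of upper sets of the three-element chain $0 < 1 < 2$, ordered by inclusion; the names `OPENS`, `meet`, `implies` and `adjunction_holds` are illustrative only.

```python
from itertools import combinations

# Hypothetical example: the three-element chain 0 < 1 < 2.
POINTS = (0, 1, 2)


def is_upper(s):
    """True if s is an upper set: x in s and x <= y imply y in s."""
    return all(y in s for x in s for y in POINTS if x <= y)


# The upper sets of the chain, ordered by inclusion, form a Heyting algebra.
OPENS = [frozenset(c)
         for r in range(len(POINTS) + 1)
         for c in combinations(POINTS, r)
         if is_upper(frozenset(c))]


def meet(x, y):
    """The infimum of two upper sets is their intersection."""
    return x & y


def implies(y, z):
    """Heyting implication: the largest upper set w with meet(w, y) <= z."""
    candidates = [w for w in OPENS if meet(w, y) <= z]
    return frozenset().union(*candidates)


def adjunction_holds():
    """Check that meet(x, y) <= z  iff  x <= implies(y, z), for all x, y, z."""
    return all((meet(x, y) <= z) == (x <= implies(y, z))
               for x in OPENS for y in OPENS for z in OPENS)
```

Calling `adjunction_holds()` compares both sides of the equivalence for every triple of upper sets. The brute-force definition of `implies` is chosen for transparency rather than efficiency.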


Definition E.4 of an equivalence of categories involves an adjunction $F \dashv G$
whose unit and counit are both natural isomorphisms, as opposed to mere natural
transformations, as in Definition E.6 of an adjunction. In that case, $G$ is an inverse
to $F$ up to isomorphism of objects (which still falls short of an exact inverse, which
as mentioned would lead to the less important notion of isomorphism of categories).
But even for an adjunction, one may regard $G$ as a weak kind of inverse to $F$, which
allows one to move between categories in the direction opposite to $F$.

Other than equivalences of categories, the traditional examples of adjunctions
yield left adjoints to so-called forgetful functors, which strip some class of
mathematical objects of (some of) its structure. For example, if Grp is the category of
groups and homomorphisms, the forgetful functor $G$ : Grp $\to$ Sets sends a given
group to its underlying set; this functor has a left adjoint $F$ : Sets $\to$ Grp that assigns
the free group on a set $X$ to $X$. Similarly for vector spaces, Boolean algebras, etc.


We now move on to limits and colimits, whose general definition we precede by
a few special cases. These abstract the corresponding constructions from Sets (and
hence pave the way for topos theory, which resembles set theory in various ways),
so that for the right “feeling” we switch to labeling objects in a category by capitals.

Definition E.7. Let $C$ be a category (for simplicity assumed to be locally small).


• A product of a pair $X, Y \in C_0$ is an object $X \times Y \in C_0$, with arrows
$p_1 : X \times Y \to X$ and $p_2 : X \times Y \to Y$, such that for all arrows $q_1 : Z \to X$ and
$q_2 : Z \to Y$, there is a unique arrow $Z \to X \times Y$ making the following diagram commute:

$$\begin{array}{ccccc}
 & & Z & & \\
 & {\scriptstyle q_1}\swarrow & \downarrow & \searrow{\scriptstyle q_2} & \\
X & \xleftarrow{\ p_1\ } & X \times Y & \xrightarrow{\ p_2\ } & Y
\end{array} \tag{E.32}$$

If each pair of objects in $C_0$ has a product, $C$ is said to have binary products.
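The next part of the definition relies on the following fact about products, which
is easy to prove: given $f : X \to X'$ and $g : Y \to Y'$ in $C_1$, there is a unique arrow

$$f \times g : X \times Y \to X' \times Y' \tag{E.33}$$

such that the following diagram commutes:

$$\begin{array}{ccccc}
X & \xleftarrow{\ p_1\ } & X \times Y & \xrightarrow{\ p_2\ } & Y \\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f \times g} & & \downarrow{\scriptstyle g} \\
X' & \xleftarrow{\ p'_1\ } & X' \times Y' & \xrightarrow{\ p'_2\ } & Y'
\end{array} \tag{E.34}$$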


• A function space or exponential of a pair $Y, Z \in C_0$ in a category $C$ with binary
products is an object $Z^Y \in C_0$ (which in Sets is the set of all functions $g : Y \to Z$)
with an evaluation map $\mathrm{ev} : Z^Y \times Y \to Z$ (which in Sets is $(g,y) \mapsto g(y) \in Z$),
such that for each $f : X \times Y \to Z$ there is a unique arrow $\hat{f} : X \to Z^Y$ (which in
Sets is $\hat{f}(x)(y) = f(x,y)$) making the following diagram commute:

$$\begin{array}{ccc}
Z^Y \times Y & \xrightarrow{\ \mathrm{ev}\ } & Z \\
{\scriptstyle \hat{f} \times \mathrm{id}_Y}\uparrow & \nearrow{\scriptstyle f} & \\
X \times Y & &
\end{array} \tag{E.35}$$

• A terminal object is an object $1 \in C_0$ such that for each $X \in C_0$, there is a unique
arrow $X \to 1$ (in other words, $\mathrm{Hom}_C(X,1)$ contains precisely one element).

• A category $C$ having a terminal object, binary products, and function spaces for
all objects, is called cartesian closed.


The relationship between products and function spaces is just the adjunction

$$(- \times Y) \dashv (-)^Y \tag{E.36}$$

for each $Y \in C_0$, where the left-hand side denotes the following functor:

$$(- \times Y) : C \to C; \tag{E.37}$$
$$X \mapsto X \times Y; \tag{E.38}$$
$$(f : X \to X') \mapsto (f \times \mathrm{id}_Y : X \times Y \to X' \times Y). \tag{E.39}$$

Here $f \times \mathrm{id}_Y$ is a special case of (E.33), whilst the right-hand side of (E.36) is

$$(-)^Y : C \to C; \tag{E.40}$$
$$Z \mapsto Z^Y; \tag{E.41}$$
$$(g : Z \to Z') \mapsto (\widehat{g \circ \mathrm{ev}} : Z^Y \to (Z')^Y), \tag{E.42}$$

where the arrow $\widehat{g \circ \mathrm{ev}}$ is defined as in the text above (E.35), in which we substitute
$X \rightsquigarrow Z^Y$, $Z \rightsquigarrow Z'$, and $f \rightsquigarrow g \circ \mathrm{ev}$; note that the latter is an arrow $Z^Y \times Y \to Z'$.
As in (E.26), the adjunction (E.36) gives a bijection

$$\mathrm{Hom}_C(X \times Y, Z) \cong \mathrm{Hom}_C(X, Z^Y), \tag{E.43}$$

which of course is precisely the correspondence $f \leftrightarrow \hat{f}$; the counit of (E.36) is $\varepsilon = \mathrm{ev}$
(i.e., its component at $Z$ is $\mathrm{ev} : Z^Y \times Y \to Z$), whereas the unit (at $Z$) is the map
$\hat{f} : X \to Z^Y$ corresponding to $f : X \times Y \to Z$ on the choices $X \rightsquigarrow Z$, $Z \rightsquigarrow Z \times Y$, and
$f : Z \times Y \to Z \times Y$ being the identity arrow.
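
In Sets, the bijection between arrows $X \times Y \to Z$ and arrows $X \to Z^Y$ is ordinary currying. The following minimal sketch illustrates it with Python functions standing in for arrows; the sets `X`, `Y` and the function `f` are hypothetical sample data, and the helper names are illustrative only.

```python
from itertools import product


def curry(f):
    """Send f : X x Y -> Z to its transpose X -> Z^Y."""
    return lambda x: (lambda y: f(x, y))


def uncurry(g):
    """Send g : X -> Z^Y back to an arrow X x Y -> Z."""
    return lambda x, y: g(x)(y)


def ev(g, y):
    """Evaluation map ev : Z^Y x Y -> Z, sending (g, y) to g(y)."""
    return g(y)


# Hypothetical sample data.
X = (0, 1, 2)
Y = ("a", "b")


def f(x, y):
    """A sample arrow X x Y -> Z, with Z a set of strings."""
    return f"{y}{x}"


def triangle_commutes():
    """Check that ev composed with (curry(f) x id_Y) equals f on X x Y."""
    return all(ev(curry(f)(x), y) == f(x, y) for x, y in product(X, Y))


def round_trip():
    """Check that uncurry inverts curry on the sample arrow f."""
    return all(uncurry(curry(f))(x, y) == f(x, y) for x, y in product(X, Y))
```

Here `triangle_commutes()` is the Sets version of the commuting triangle that defines the exponential, and `round_trip()` checks one direction of the bijection between the two hom-sets.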

The following construction, generalizing binary products, is very important. 


Definition E.8. The pullback of two arrows $f : X \to Z$ and $g : Y \to Z$ consists of two
arrows $p : P \to X$ and $q : P \to Y$ such that the following square commutes, and has
the universal property that for any arrows $p' : P' \to X$ and $q' : P' \to Y$ with $f p' = g q'$,
there is a unique arrow $h : P' \to P$ such that the entire diagram commutes (i.e.,
$p h = p'$ and $q h = q'$):

$$\begin{array}{ccc}
P & \xrightarrow{\ q\ } & Y \\
{\scriptstyle p}\downarrow & & \downarrow{\scriptstyle g} \\
X & \xrightarrow{\ f\ } & Z
\end{array} \tag{E.44}$$

One says that $q$ is a pullback of $f$ over $g$, whilst $p$ is a pullback of $g$ over $f$.
In the category Sets, pullbacks coincide with fibered products, that is,

$$P = X \times_Z Y = \{(x,y) \in X \times Y \mid f(x) = g(y)\}, \tag{E.45}$$

where $p$ and $q$ are the projections on the first and the second coordinate, respectively.
In particular, taking $Z$ to be a singleton reproduces binary products as special cases
of pullbacks. This can be done in all categories $C$ with a terminal object.
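
The fibered product in Sets is easy to compute for finite sets. The sketch below builds it for hypothetical sample data (the sets `X`, `Y` and the parity maps `f`, `g` are assumptions made for illustration) and also constructs the mediating arrow $h$ required by the universal property.

```python
def pullback(X, Y, f, g):
    """Fibered product {(x, y) | f(x) == g(y)} with its two projections."""
    P = {(x, y) for x in X for y in Y if f(x) == g(y)}

    def p(pair):
        return pair[0]

    def q(pair):
        return pair[1]

    return P, p, q


def mediating(P, p_prime, q_prime):
    """The unique h : P' -> P with p(h(z)) = p'(z) and q(h(z)) = q'(z)."""
    def h(z):
        pair = (p_prime(z), q_prime(z))
        # This holds whenever f(p'(z)) == g(q'(z)), i.e. for a genuine cone.
        assert pair in P
        return pair
    return h


# Hypothetical sample data: two maps into Z = {0, 1} given by parity.
X = range(6)
Y = range(4)


def f(x):
    return x % 2


def g(y):
    return y % 2


P, p, q = pullback(X, Y, f, g)

# A cone with vertex P' = {0, 1, 2}: p'(z) = 2z and q'(z) = 0 are both even.
h = mediating(P, lambda z: 2 * z, lambda z: 0)

# With Z a singleton, both maps are constant and P is the whole product.
PRODUCT, _, _ = pullback(X, Y, lambda x: 0, lambda y: 0)
```

The last line illustrates the remark that binary products arise as pullbacks over a singleton: `PRODUCT` contains every pair.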

At last, we turn to limits and colimits in a category. A (finite) diagram in a
category $C$ is a functor $D : J \to C$, where $J$ is some (finite) category. In case that
$J$ is empty, we say that there is a unique functor $D$ into $C$; even this is interesting!
The diagram just consisting of two objects $X, Y \in C_0$ corresponds to $J_0 = \{0,1\}$ with
$0 \neq 1$ and only identity arrows. The next case is an arrow $f : X \to Y$, obtained from
$J_0 = \{0,1\}$ as a poset, i.e., $0 \leq 1$. Finally, consider $J_0 = \{0,1,2\}$ with nontrivial
arrows $0 \to 1$ and $2 \to 1$; this defines a diagram

$$Y \xrightarrow{\ g\ } Z \xleftarrow{\ f\ } X. \tag{E.46}$$

For any $C \in C_0$, let $D_C : J \to C$ be the constant functor that sends all $j \in J_0$ to $C$, and
all arrows in $J$ to $\mathrm{id}_C$. A cone over a diagram $D : J \to C$ is an object $C \in C_0$ (called
the vertex of the cone) with a natural transformation from $D_C$ to $D$, i.e., a collection
of arrows $c_j : C \to D_j \equiv D_0(j)$ indexed by $j \in J_0$, such that for each arrow $a : j \to k$
in $J_1$, with induced arrow $D_1(a) : D_j \to D_k$, the following triangle commutes:

$$\begin{array}{ccc}
 & C & \\
{\scriptstyle c_j}\swarrow & & \searrow{\scriptstyle c_k} \\
D_j & \xrightarrow{\ D_1(a)\ } & D_k
\end{array} \tag{E.47}$$


A cone over the empty diagram is just a loose object $C$. A cone over our two-object
diagram without arrows is $X \leftarrow C \to Y$, whereas a cone over (E.46) is a commuting
square as in (E.44). A limit of a diagram $D : J \to C$ is a universal cone over $D$, i.e.,
a cone $(C, \{c_j : C \to D_j\}_{j \in J_0})$ such that for any other cone $(C', \{c'_j : C' \to D_j\}_{j \in J_0})$ for
the same diagram there is a unique arrow $h : C' \to C$ such that

$$c_j \circ h = c'_j \quad (j \in J_0). \tag{E.48}$$

A more elegant way of phrasing this is via the category Cone($D$), whose objects are
cones over $D$, and whose arrows are arrows $h : C' \to C$ in $C$ satisfying (E.48). A
limit of $D$, then, is just a terminal object in Cone($D$). Either way, it is clear from
the universal property that any two limits of a given diagram must be isomorphic.
Despite this lack of uniqueness, the typical notation for a limit of a diagram $D$ is

$$C = \varprojlim_j D_j. \tag{E.49}$$

It should now be clear that a terminal object is a limit over the unique diagram
over the empty category, a (binary) product is a limit over a two-object diagram
obtained from $J_0 = \{0,1\}$ with only identities, and finally a pullback is a limit over
the diagram (E.46) obtained via $J_0 = \{0,1,2\}$ with nontrivial arrows $0 \to 1$ and $2 \to 1$.

Especially in connection with topos theory, the following fact is quite useful: 


Proposition E.9. A category has all finite limits (i.e. limits based on finite diagrams) 
iff it has all pullbacks and has a terminal object. 


Replacing $C$ by its opposite category $C^{op}$, we obtain the colimit $C = \varinjlim_j D_j$ of a
diagram, which is defined as a limit of the same diagram seen in $C^{op}$, so that in
all definitions all arrows are reversed. Thus terminal objects are replaced by initial
objects, products become coproducts, and pullbacks are turned into pushouts.

E.2 Toposes and functor categories


The last ingredient we need for the definition of a topos is a categorical abstraction
of subsets $X \subset Y$ and their characteristic functions $1_X$, i.e. a subobject classifier.


Definition E.10. 1. An arrow $m : X \to Y$ in any category $C$ is a monomorphism
(or briefly a mono) if for any $g, h : Z \to X$, the equality $mg = mh$ implies $g = h$.
Similarly, abstracting surjectivity rather than injectivity, $e : X \to Y$ is called an
epimorphism or an epi if for any $g, h : Y \to Z$, the equality $ge = he$ implies $g = h$.

2. Two monomorphisms $m : X \to Y$ and $m' : X' \to Y$ are equivalent if there is a
(necessarily unique) isomorphism $h : X \to X'$ such that $m = m'h$.

3. A subobject of $Y$ is an equivalence class of monomorphisms $m : X \to Y$. The
class of all subobjects of $Y$ (which is not necessarily a set) is called Sub($Y$).

4. A subobject classifier in a category $C$ is a mono $t : 1 \to \Omega$ such that all pullbacks
of $t$ exist in $C$, and for any mono $m : X \to Y$ there is a unique arrow $\chi_m : Y \to \Omega$
(called the characteristic function or classifying map of $m$, or, loosely, of $X$)
that makes the following diagram a pullback (and hence makes it commutative):

$$\begin{array}{ccc}
X & \xrightarrow{\ \exists!\ } & 1 \\
{\scriptstyle m}\downarrow & & \downarrow{\scriptstyle t} \\
Y & \xrightarrow{\ \chi_m\ } & \Omega
\end{array} \tag{E.50}$$

It follows that the object 1 is terminal in $C$ (which of course constrains $C$ to have a
terminal object in the first place); $\Omega$ is often called the truth object of $C$.


Proposition E.11. If a locally small category $C$ has a subobject classifier, then for
any $Y \in C_0$, the class Sub($Y$) is a set, and the map $m \mapsto \chi_m$ induces a bijection

$$\mathrm{Sub}(Y) \cong \mathrm{Hom}_C(Y, \Omega). \tag{E.51}$$

Proof. It follows from the definition of a pullback that equivalent monos $m : X \to Y$
and $m' : X' \to Y$ yield the same arrow $\chi_m$, so that the map $m \mapsto \chi_m$ from monos to
arrows passes to equivalence classes, i.e., we have a map $[m] \mapsto \chi_m$. The universal
part in the definition of a pullback (i.e., monos with the same classifying maps are
isomorphic) makes the latter map injective, whereas surjectivity follows from the
general fact that the pullback (namely $m$) of any arrow (namely $\chi$) over a mono
(namely $t$) is a mono, where we see (E.50) as a pullback for given $\chi$ and $t$.


For example, in Sets a mono is an injective function (and an epi is surjective), so
that any mono into $Y$ originates in some set that is isomorphic to some subset of $Y$.
Any singleton $1 = * = \{0\}$ serves as a terminal object, and Sets has a truth object

$$\Omega = \{0, 1\}, \tag{E.52}$$

with subobject classifier $t(*) = 1$; if $X \subset Y$, and $m$ is the inclusion map, then $\chi_m = 1_X$
is just the characteristic function of $X$, and $\mathrm{Sub}(X) \cong \mathcal{P}(X)$ is the power set of $X$.
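
For a finite set the classifying property in Sets can be checked exhaustively. The following minimal sketch assumes the two-element truth object `OMEGA = (0, 1)` with distinguished value `TRUE = 1`; the helper names and the sample sets used in the checks are hypothetical and chosen for illustration only.

```python
from itertools import combinations

OMEGA = (0, 1)  # truth object of Sets
TRUE = 1        # value of the classifier t at the single point *


def chi(X):
    """Characteristic function of a subset X, as a map into OMEGA."""
    return lambda y: 1 if y in X else 0


def pull_back_true(Y, c):
    """Pullback of t along c : Y -> OMEGA, i.e. {y in Y | c(y) == t(*)}."""
    return frozenset(y for y in Y if c(y) == TRUE)


def subsets(Y):
    """All subsets of the finite set Y."""
    items = sorted(Y)
    return [frozenset(s)
            for r in range(len(items) + 1)
            for s in combinations(items, r)]


def classifies(Y):
    """Check that X -> chi(X) is a bijection from subsets of Y to maps Y -> OMEGA."""
    items = sorted(Y)
    subs = subsets(Y)
    # Each map Y -> OMEGA is recorded as its table of values.
    tables = {tuple(chi(X)(y) for y in items) for X in subs}
    recovered = all(pull_back_true(Y, chi(X)) == X for X in subs)
    return recovered and len(tables) == len(subs) == len(OMEGA) ** len(items)
```

The function `pull_back_true` recovers a subset from its characteristic function, which is the pullback property of the classifier, and `classifies` confirms that distinct subsets have distinct characteristic functions and that every map into `OMEGA` arises in this way.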
The haunting name “truth object” for $\Omega$ might explain some of the fascination
logicians and quantum physicists have felt for topos theory, which we now define:


Definition E.12. A topos is a cartesian closed category (i.e., having a terminal
object, binary products, and function spaces) with pullbacks and a subobject classifier.


More precisely, this defines an elementary topos. It follows from Proposition E.9
that a topos has all finite limits, and it can be shown that it also has all finite colimits.
It should be clear that Sets is a topos; indeed, in our presentation the presence of
the necessary ingredients of a topos within Sets partly motivated these ingredients.
More generally, all toposes relevant to this book are of the following sort. We first
note that for any two categories $C$ and $D$ we obtain a new category $[C, D]$ whose
objects are (covariant) functors from $C$ to $D$, and whose arrows are natural
transformations between such functors. It is often natural to consider contravariant functors,
giving the category $[C^{op}, D]$. If $D$ = Sets, such functors are called presheaves on $C$.
The category $[C^{op}, \mathrm{Sets}]$ is often denoted by $\mathrm{Sets}^{C^{op}}$. An important special case is

$$C = \mathcal{O}(X), \tag{E.53}$$

i.e. the topology of some space $X$ (seen as a posetal category); with slight abuse of
notation, functors $F : \mathcal{O}(X)^{op} \to$ Sets are called presheaves on $X$.


Theorem E.13. For any small category $C$, the category $[C^{op}, \mathrm{Sets}]$ is a topos.


Proof. We focus on the subobject classifier; the remainder follows from the fact
that limits in $[C^{op}, \mathrm{Sets}]$ (including pullbacks and function spaces) are computed
pointwise, i.e., if $D : J \to [C^{op}, \mathrm{Sets}]$ is a diagram, then for each $C \in C_0$ we obtain
a diagram $D_C$ in Sets defined by $D_C(j) = D(j)(C)$. Since Sets has all limits, we
obtain limits $L(C)$ for each $D_C$. These form a single functor $L$, which is a limit of $D$.

The simplest example is the terminal object in $[C^{op}, \mathrm{Sets}]$, which comes out as
$1_0(C) = *$ for each $C \in C_0$, where $*$ is some arbitrary (but fixed) singleton.

To discuss the truth object in $[C^{op}, \mathrm{Sets}]$, we need a few definitions.


Definition E.14. 1. In any small category $C$, a sieve on an object $C \in C_0$ is a set $S$
of arrows with target $C$ such that if $f \in S$, then $fh \in S$ whenever $fh$ is defined.

2. The maximal sieve $S_{\max}(C)$ on $C$ consists of all arrows with target $C$.

3. The pullback sieve $f^*S$ (on $D$) over an arrow $f : D \to C$ consists of all arrows

$$f^*S = \{h : X \to D \mid fh \in S\}. \tag{E.54}$$

4. We denote the set of all sieves on $C$ by Sieves($C$).
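
Sieves are easy to enumerate when the category is a finite poset, since an arrow into a given object is then determined by its source. The following minimal sketch assumes, as a hypothetical example, the poset of divisors of 6 ordered by divisibility, and represents a sieve on `c` by the set of sources of its arrows; all names are illustrative.

```python
from itertools import combinations

# Hypothetical example: the divisors of 6, with one arrow a -> b iff a divides b.
OBJECTS = (1, 2, 3, 6)


def leq(a, b):
    """There is a (unique) arrow a -> b iff a divides b."""
    return b % a == 0


def sources(c):
    """Sources of all arrows with target c; the arrow d -> c is identified with d."""
    return [d for d in OBJECTS if leq(d, c)]


def is_sieve(S, c):
    """S consists of arrows into c and is closed under precomposition."""
    return (all(leq(d, c) for d in S)
            and all(e in S for d in S for e in sources(d)))


def sieves(c):
    """All sieves on c."""
    src = sources(c)
    return [frozenset(s)
            for r in range(len(src) + 1)
            for s in combinations(src, r)
            if is_sieve(frozenset(s), c)]


def maximal_sieve(c):
    """The sieve of all arrows with target c."""
    return frozenset(sources(c))


def pullback_sieve(S, d, c):
    """Pullback of the sieve S on c over the arrow d -> c: all x -> d whose composite lies in S."""
    assert leq(d, c)
    return frozenset(x for x in sources(d) if x in S)
```

In this example a sieve on `c` is simply a downward closed set of divisors of `c`. One can check that a sieve containing the identity arrow is maximal, and that the pullback of `S` over an arrow is the maximal sieve exactly when that arrow belongs to `S`.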
Clearly, if $\mathrm{id}_C \in S$, then $S = S_{\max}(C)$. We will show that the truth object in $[C^{op}, \mathrm{Sets}]$ is

$$\Omega_0(C) = \mathrm{Sieves}(C); \tag{E.55}$$
$$\Omega_1(f) = f^*. \tag{E.56}$$

The subobject classifier in $[C^{op}, \mathrm{Sets}]$, then, is the natural transformation

$$t : 1 \to \Omega; \tag{E.57}$$
$$t_C : 1_0(C) \to \mathrm{Sieves}(C); \tag{E.58}$$
$$t_C(*) = S_{\max}(C). \tag{E.59}$$

To understand this, we need the Yoneda Lemma E.15 below. In preparation, for any
(fixed) $C \in C_0$, we define a functor $y_C : C^{op} \to$ Sets by

$$(y_C)_0(D) = \mathrm{Hom}_C(D, C); \tag{E.60}$$
$$(y_C)_1(D \xrightarrow{\ f\ } D') = (g \mapsto gf), \tag{E.61}$$

the latter being a map from $\mathrm{Hom}_C(D', C)$ to $\mathrm{Hom}_C(D, C)$. This is often written as

$$y_C = \mathrm{Hom}_C(-, C), \tag{E.62}$$

and the functors $y_C$ are called representable presheaves. Since $f : C \to C'$ induces
a natural transformation $y_C \to y_{C'}$ in the obvious way, i.e., its component $\tau_D$ at $D$ is
the map $g \mapsto fg$ from $\mathrm{Hom}_C(D, C)$ to $\mathrm{Hom}_C(D, C')$, the map $C \mapsto y_C$ extends to a
functor $y : C \to [C^{op}, \mathrm{Sets}]$, called the Yoneda embedding.


Lemma E.15. For any $F \in [C^{op}, \mathrm{Sets}]$, any $C, D \in C_0$, and $x \in F_0(C)$, the map

$$\tau^{(x)}_D : \mathrm{Hom}_C(D, C) \to F_0(D); \tag{E.63}$$
$$(D \xrightarrow{\ f\ } C) \mapsto F_1(f)(x), \tag{E.64}$$

where $F_1(f)$ duly maps $F_0(C)$ to $F_0(D)$, forms the component at $D$ of a natural
transformation $\tau^{(x)}$ from $y_C$ to $F$, and the ensuing map $x \mapsto \tau^{(x)}$ gives a bijection

$$F_0(C) \cong \mathrm{Hom}_{[C^{op}, \mathrm{Sets}]}(y_C, F). \tag{E.65}$$

Recall that by definition of a functor category, the right-hand side of (E.65) consists
precisely of the set $\mathrm{Nat}(y_C, F)$ of natural transformations from $y_C$ to $F$.
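
The Yoneda bijection can be verified by brute force on a very small category. The sketch below assumes, as a hypothetical example, the chain $0 \leq 1 \leq 2$ seen as a posetal category (so each representable presheaf takes values in singletons or the empty set) and a made-up presheaf whose value at `c` is the set of all subsets of `{0, ..., c}`, with restriction given by intersection. All names are illustrative.

```python
from itertools import combinations, product

OBJECTS = (0, 1, 2)  # the chain 0 <= 1 <= 2; one arrow d -> c iff d <= c


def F0(c):
    """Hypothetical presheaf on objects: all subsets of {0, ..., c}."""
    base = range(c + 1)
    return [frozenset(s) for r in range(c + 2) for s in combinations(base, r)]


def F1(d, c, x):
    """Restriction along the arrow d -> c, mapping F0(c) to F0(d)."""
    assert d <= c
    return x & frozenset(range(d + 1))


def natural_transformations(C):
    """All natural transformations from the representable presheaf at C to F.

    The representable presheaf at C is a singleton at each D <= C and empty
    otherwise, so a transformation is a dict sending each D <= C to the image
    of the unique arrow D -> C, subject to naturality.
    """
    below = [D for D in OBJECTS if D <= C]
    result = []
    for choice in product(*(F0(D) for D in below)):
        tau = dict(zip(below, choice))
        if all(tau[D] == F1(D, E, tau[E])
               for D in below for E in below if D <= E):
            result.append(tau)
    return result


def yoneda(C, x):
    """The transformation attached to x in F0(C): the arrow D -> C goes to F1(D, C, x)."""
    return {D: F1(D, C, x) for D in OBJECTS if D <= C}
```

For each object `C`, the list `natural_transformations(C)` has exactly as many elements as `F0(C)`, and every one of them is of the form `yoneda(C, x)`, which is the content of the lemma in this small case.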


Lemma E.16. For any $C \in C_0$ and $S \in \mathrm{Sieves}(C)$, the presheaf $X^{(S)}$ defined by

$$X^{(S)}_0(D) = \mathrm{Hom}_C(D, C) \cap S; \tag{E.66}$$
$$X^{(S)}_1(D \xrightarrow{\ f\ } D') = (g \mapsto gf) \quad (g \in X^{(S)}_0(D')), \tag{E.67}$$

defines a subobject $m : X^{(S)} \to y_C$, and the ensuing map $S \mapsto X^{(S)}$ yields a bijection

$$\mathrm{Sieves}(C) \xrightarrow{\ \cong\ } \mathrm{Sub}(y_C). \tag{E.68}$$

More generally, if $X$ and $Y$ (generalizing $X^{(S)}$ and $y_C$ above) are presheaves on $C$
with $X_0(D) \subset Y_0(D)$ for all $D$, then the equivalence class of $X$ is a subobject of $Y$.
The proof of Lemma E.16 below uses the converse fact: any subobject of $Y$ has a
representative $X'$ for which $X'$ is a subfunctor of $Y$, i.e., $X'_0(D) \subset Y_0(D)$ for all $D$,
and $X'_1$ is the restriction of $Y_1$, as in (E.70) below. To see this, suppose one has a
mono $m : X \to Y$, so that each component $m_D : X_0(D) \to Y_0(D)$ of $m$ is an injective
function. We now define a presheaf $X'$ on $C$ by

$$X'_0(D) = m_D(X_0(D)) \subset Y_0(D); \tag{E.69}$$
$$X'_1(f) = Y_1(f)|_{X'_0(D')} \quad (f : D \to D'). \tag{E.70}$$

Furthermore, we define a natural transformation $m' : X' \to Y$, whose components
$m'_D : X'_0(D) \to Y_0(D)$ are given by set-theoretic inclusion. The natural transformation
$h : X \to X'$, defined through its components $h_D = \tilde{m}_D$ (where $\tilde{m}_D$ is $m_D$, but seen as
a map from $X_0(D)$ to $X'_0(D)$ rather than to $Y_0(D)$) then renders $m$ and $m'$ isomorphic.


Proof. The map $S \mapsto X^{(S)}$ has an inverse $X \mapsto S_X$, where $S_X \in \mathrm{Sieves}(C)$ is given by

$$S_X = \bigcup_{D \in C_0} X_0(D).$$

Combining (E.55) - (E.56) with Lemma E.15 applied to $F = \Omega$, gives

$$\mathrm{Hom}_{[C^{op}, \mathrm{Sets}]}(y_C, \Omega) \cong \mathrm{Sieves}(C), \tag{E.71}$$

so that Lemma E.16 yields a bijective correspondence between arrows from $y_C$ to
$\Omega$, as defined in (E.55) - (E.56), and subobjects of $y_C$. At $D$, diagram (E.50) is

$$\begin{array}{ccc}
\mathrm{Hom}_C(D, C) \cap S & \xrightarrow{\ \exists!\ } & * \\
{\scriptstyle m_D}\downarrow & & \downarrow{\scriptstyle t_D} \\
\mathrm{Hom}_C(D, C) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D)
\end{array} \tag{E.72}$$

where $m_D$ is the inclusion map, $t_D(*) = S_{\max}(D)$, and $(\chi_m)_D(f) = f^*S$. Commutativity
of this diagram follows from the fact that if $f \in \mathrm{Hom}_C(D, C) \cap S$, then $f^*S$
is the maximal sieve on $D$, as trivially follows from the definition of a sieve. The
pullback condition is then easy to verify from Lemma E.16.
If we replace $y_C$ by any presheaf $Y$, the classifying map $\chi_m$ is given by

$$(\chi_m)_D : Y_0(D) \to \mathrm{Sieves}(D); \tag{E.73}$$
$$x \mapsto \{f : D' \to D \mid Y_1(f)(x) \in X_0(D')\}, \tag{E.74}$$

noting that $Y_1(f)$ maps $Y_0(D)$ into $Y_0(D')$, and $X_0(D') \subset Y_0(D')$, since we assume that $X$
represents a subobject of $Y$ such that $X_0(D) \subset Y_0(D)$. This generalizes the previous
case where $Y = y_C$. To show that $\chi_m$ is unique, we write down (E.50) at $D$:

$$\begin{array}{ccc}
X_0(D) & \xrightarrow{\ \exists!\ } & * \\
{\scriptstyle m_D}\downarrow & & \downarrow{\scriptstyle t_D} \\
Y_0(D) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D)
\end{array} \tag{E.75}$$

Also, the condition that $\chi_m$ be a natural transformation implies that the diagram

$$\begin{array}{ccc}
Y_0(D) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D) \\
{\scriptstyle Y_1(f)}\downarrow & & \downarrow{\scriptstyle f^*} \\
Y_0(D') & \xrightarrow{\ (\chi_m)_{D'}\ } & \mathrm{Sieves}(D')
\end{array} \tag{E.76}$$

commutes for any $f : D' \to D$. Then (E.75) with $D \rightsquigarrow D'$ implies that for any
$y \in Y_0(D')$, we have $y \in X_0(D')$ iff $(\chi_m)_{D'}(y) = S_{\max}(D')$. In particular, we may take
$y = Y_1(f)(x)$ for $x \in Y_0(D)$, so that $Y_1(f)(x) \in X_0(D')$ iff $(\chi_m)_{D'}(Y_1(f)(x)) = S_{\max}(D')$.
Commutativity of the diagram (E.76) gives $(\chi_m)_{D'} \circ Y_1(f) = f^* \circ (\chi_m)_D$, so that
$Y_1(f)(x) \in X_0(D')$ iff $f^*((\chi_m)_D(x)) = S_{\max}(D')$, which in turn is the case if and
only if $f \in (\chi_m)_D(x)$. Hence we finally obtain

$$f \in (\chi_m)_D(x) \ \text{ iff } \ Y_1(f)(x) \in X_0(D'), \tag{E.77}$$

which is the definition (E.73) - (E.74) of $\chi_m$, and renders it unique (given $m$).

Finally, the universal property of (E.50) follows from Proposition E.11: if $X'$ in
(E.50) is like $P'$ in (E.44), then $m' : X' \to Y$ is the pullback of $\chi$ over $t$,
and hence $m'$ must be equivalent to $m$. But we know (cf. Definition E.10.2) that an
equivalence between monos is unique. This closes the proof of Theorem E.13.


Refining presheaves, we also introduce the category Sh($X$) of sheaves on $X$,
which is the full subcategory of $[\mathcal{O}(X)^{op}, \mathrm{Sets}]$ defined by the following condition.


Definition E.17. A presheaf $F : \mathcal{O}(X)^{op} \to$ Sets on $X$ is a sheaf if for any open
$U \in \mathcal{O}(X)$, any open cover $U = \bigcup_j U_j$ of $U$, and any family $\{s_j \in F_0(U_j)\}$ such that

$$F_1(U_{jk} \leq U_j)(s_j) = F_1(U_{jk} \leq U_k)(s_k), \tag{E.78}$$

for all $j, k$, there is a unique $s \in F_0(U)$ such that $s_j = F_1(U_j \leq U)(s)$ for all $j$.

Here $U_{jk} = U_j \cap U_k$, and $F_1(V \leq W) : F_0(W) \to F_0(V)$ is the arrow part of the functor
$F$. If $F$ is a sheaf on $X$, then for each open $U = \bigcup_j U_j$, it has the continuity property

$$F_0(U) = \varprojlim_j F_0(U_j), \tag{E.79}$$

where the limit is defined with respect to the diagram $D : J \to$ Sets where $J$ is the
posetal category whose objects are $j \in J$, and $(i,j) \in J \times J$ provided $U_{ij} \neq \emptyset$, ordered
by $i \leq (i,j)$ and $j \leq (i,j)$, with $D(i) = F(U_i)$ and $D(i,j) = F(U_{ij})$, etc.
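
The gluing condition can be tested mechanically in a toy case. The sketch below assumes a hypothetical presheaf that assigns to each set $U$ of points all functions from $U$ to $\{0, 1\}$ (not only continuous ones), with restriction of functions as the arrow part, and a hypothetical cover of $\{0, 1, 2\}$ by two overlapping sets. All names are illustrative.

```python
from itertools import product


def sections(U, values=(0, 1)):
    """All functions U -> values, each stored as a frozenset of (point, value) pairs."""
    pts = sorted(U)
    return [frozenset(zip(pts, vals))
            for vals in product(values, repeat=len(pts))]


def restrict(s, V):
    """Restriction of a section s to the smaller set V."""
    return frozenset((p, v) for p, v in s if p in V)


def compatible(family, cover):
    """The sections in the family agree on all pairwise overlaps."""
    n = len(cover)
    return all(restrict(family[j], cover[j] & cover[k])
               == restrict(family[k], cover[j] & cover[k])
               for j in range(n) for k in range(n))


def gluings(family, cover):
    """All sections over the union that restrict to family[j] on each cover[j]."""
    U = frozenset().union(*cover)
    return [s for s in sections(U)
            if all(restrict(s, cover[j]) == family[j]
                   for j in range(len(cover)))]


def sheaf_condition_holds(cover):
    """Every compatible family has exactly one gluing."""
    for family in product(*(sections(Uj) for Uj in cover)):
        if compatible(family, cover) and len(gluings(family, cover)) != 1:
            return False
    return True


# Hypothetical cover of {0, 1, 2} by two overlapping sets.
COVER = [frozenset({0, 1}), frozenset({1, 2})]
```

For this cover there are 16 families of local sections, of which 8 are compatible; these correspond bijectively to the 8 sections over the union, which is the finite counterpart of the continuity property stated above.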

A key example of a sheaf on $X$ is the sheaf of continuous functions, where

$$F_0(U) = C(U, \mathbb{R}). \tag{E.80}$$

If $U \leq V$, then the associated map $F_1(U \leq V) : C(V, \mathbb{R}) \to C(U, \mathbb{R})$ is simply
given by restriction. Sheaves may be defined far more generally (as done by
Grothendieck), namely on a site (which is a category equipped with a so-called
Grothendieck topology), but sheaves on a space are all we need in this book.
Analogously to Theorem E.13, Sh($X$) is a topos, whose truth object is the sheaf

$$\Omega_0(U) = \mathcal{O}(U); \tag{E.81}$$
$$\Omega_1(U \leq V) = (-) \cap U, \tag{E.82}$$

i.e., if $W \in \mathcal{O}(V)$, then $\Omega_1(U \leq V)(W) = W \cap U \in \mathcal{O}(U)$. With the terminal object in
Sh($X$) being borrowed from $[\mathcal{O}(X)^{op}, \mathrm{Sets}]$, i.e., $1_0(U) = *$, its subobject classifier is

$$t_U(*) = U. \tag{E.83}$$

In fact, let $X$ be a poset equipped with its intrinsic Alexandrov topology, whose
open sets are the upper sets, i.e. those $U \subset X$ for which $x \in U$ and $x \leq y$ implies
$y \in U$. Examples of opens are up-sets $U = {\uparrow}x = \{y \in X \mid x \leq y\}$, which form a basis
of the Alexandrov topology; in fact, ${\uparrow}x$ is the smallest open set containing $x$. For any
$x \in X$, we write Upper($x$) for the set of all upper sets contained in ${\uparrow}x$.


Proposition E.18. If $X$ is a poset, the category $[X, \mathrm{Sets}]$ of functors $F : X \to$ Sets
(where $X$ is seen as a category defined by the underlying poset) is isomorphic to the
category Sh($X$) of sheaves on $X$ (equipped with the Alexandrov topology), i.e.,

$$[X, \mathrm{Sets}] \cong \mathrm{Sh}(X). \tag{E.84}$$

Note that $[X, \mathrm{Sets}]$ consists of presheaves on $X^{op}$ (in which $x \leq y$ iff $y \leq x$ in $X$).


Proof. This isomorphism is given by mapping a functor $\underline{F} : X \to$ Sets to a sheaf
$F : \mathcal{O}(X)^{op} \to$ Sets, by defining the latter on a basis of the Alexandrov topology as

$$F({\uparrow}x) = \underline{F}(x), \tag{E.85}$$

extended to general Alexandrov opens by (E.79). Vice versa, a sheaf $F$ on $X$
immediately defines $\underline{F}$ by reading (E.85) from right to left.


Corollary E.19. If $X$ is a poset, the subobject classifier in $[X, \mathrm{Sets}]$ is given by

$$\Omega_0(x) = \mathrm{Upper}(x); \tag{E.86}$$
$$\Omega_1(x \leq y) = (-) \cap ({\uparrow}y); \tag{E.87}$$
$$t_x(*) = {\uparrow}x. \tag{E.88}$$

Proof. If $C$ is a poset $X$, then a sieve on $x \in X$ is a lower subset of ${\downarrow}x$ (i.e., if $y \in S$
then $y \leq x$, and if also $z \leq y$, then $z \in S$). Recalling the comment after (E.84), the
claim then follows from (E.55) - (E.59). Alternatively, using Proposition E.18, the
claim also follows from (E.81) - (E.83).

E.3 Subobjects and Heyting algebras in a topos


There are numerous connections between topos theory and intuitionistic logic, most
of which generalize links between set theory and classical logic. The beginning of
algebraic logic was Boole’s work, which in modern parlance structured the power
set $\mathcal{P}(X)$ of any set as a Boolean lattice, and hence provided a semantics for
classical propositional logic, cf. §D.2. From a categorical view, $\mathcal{P}(X)$ is the set Sub($X$),
cf. (E.52) and subsequent text. This generalizes to any topos in which Sub($X$) is a
set (rather than a proper class), except for the decisive difference that Sub($X$) is no
longer a Boolean lattice but a Heyting algebra, making topos logic intuitionistic.


Proposition E.20. For any object $X$ in a locally small topos $T$, the set Sub($X$) of
subobjects of $X$ is a Heyting algebra with respect to the partial ordering $\leq$ defined
by $[m : U \to X] \leq [m' : V \to X]$ iff there is $h : U \to V$ such that $m'h = m$.


It is easy to show from Definition E.10.1 that $\leq$ is well defined, and since it is
defined “on the nose”, i.e., at the level of representatives of the equivalence classes
in question, in what follows we will use monos rather than their equivalence classes.


Proof. Since we only need this result for presheaf toposes, we just list the pertinent
operations, and omit the verification of the details (which is left to the reader).


• The bottom element $\bot$ of Sub($X$) is the unique arrow $0 \to X$, where 0 is the
initial object in $T$ (any category with finite colimits has such an object, denoted
by 0, whose defining property is that for any $X$ there is a unique arrow $0 \to X$; as
the notation suggests, in Sets the empty set is an initial object).

• The top element $\top$ of Sub($X$) is the identity arrow $\mathrm{id}_X$ at $X$.

• The inf of $m : U \to X$ and $m' : V \to X$ is their pullback, i.e., abusing notation,

$$\begin{array}{ccc}
U \wedge V & \xrightarrow{\ q\ } & V \\
{\scriptstyle p}\downarrow & & \downarrow{\scriptstyle m'} \\
U & \xrightarrow{\ m\ } & X
\end{array} \tag{E.89}$$

so that the desired arrow $U \wedge V \to X$ is $mp = m'q$ (which is indeed a mono).

• The sup of $m : U \to X$ and $m' : V \to X$ is more complicated. In any topos $T$,
arrows have an epi-mono factorization $f = me$, where $m$ is mono (and as such is
unique up to isomorphism), called the image of $f$, and $e$ is epi. Furthermore, $T$
has finite colimits including coproducts. Reversing all arrows in (E.32) gives

$$\begin{array}{ccccc}
U & \longrightarrow & U \sqcup V & \longleftarrow & V \\
 & {\scriptstyle m}\searrow & \downarrow{\scriptstyle f} & \swarrow{\scriptstyle m'} & \\
 & & X & &
\end{array} \tag{E.90}$$

The sup $U \vee V$, then, is “the” image of the arrow $f$ in this diagram.

• Finally, implication $\to$ is defined in terms of an equalizer, which may be
constructed as a pullback, as follows: taking $Y = Z = X$ and $q_1 = q_2 = \mathrm{id}_X$ in
(E.32) gives a unique arrow $\Delta_X : X \to X \times X$, called the diagonal; in Sets it is
$\Delta_X(x) = (x,x)$. Furthermore, if we have two arrows $f, g : X \to Y$, taking $Z \rightsquigarrow X$,
$X \rightsquigarrow Y$, $q_1 \rightsquigarrow f$, and $q_2 \rightsquigarrow g$ in (E.32) gives a unique arrow $\langle f, g \rangle : X \to Y \times Y$,
which in Sets is of course given by $\langle f, g \rangle(x) = (f(x), g(x))$.

The equalizer of $f$ and $g$, then, is the arrow $e : E \to X$ in the pullback

$$\begin{array}{ccc}
E & \xrightarrow{\ p\ } & Y \\
{\scriptstyle e}\downarrow & & \downarrow{\scriptstyle \Delta_Y} \\
X & \xrightarrow{\ \langle f, g \rangle\ } & Y \times Y.
\end{array} \tag{E.91}$$

The equalizer indeed deserves its name, because the map $p$ equals both $fe$ and $ge$;
in Sets, $E \subset X$ may simply be taken to be the subset on which $f$ and $g$ coincide.
We return to our monos $m : U \to X$ and $m' : V \to X$, with inf $U \wedge V$: the mono
$(U \to V) \to X$ is the equalizer of the classifying maps $\chi_U, \chi_{U \wedge V} : X \to \Omega$.


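Recall that in Sets we may identify Sub($X$) with the power set $\mathcal{P}(X)$, so that

$$\bot = \emptyset; \tag{E.92}$$
$$\top = X. \tag{E.93}$$

For $U, V \subseteq X$, the above constructions reduce to the well-known expressions

$$U \leq V \ \text{ iff } \ U \subseteq V; \tag{E.94}$$
$$U \wedge V = U \cap V; \tag{E.95}$$
$$U \vee V = U \cup V; \tag{E.96}$$
$$U \to V = U^c \cup V, \tag{E.97}$$

where for comparison below we may rewrite the right-hand side of (E.97) as

$$U^c \cup V = \{x \in X \mid x \in U \Rightarrow x \in V\}. \tag{E.98}$$

The (derived) expression (D.12) for negation then equates $\neg$ with complementation:

$$\neg U = U^c = \{x \in X \mid x \notin U\}. \tag{E.99}$$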

In a presheaf topos [C^op, Sets], one obtains similar expressions for ⊥ and ⊤, viz.

    ⊥_0(C) = ∅;                                             (E.100)
    ⊤_0(C) = X_0(C),                                        (E.101)

where the functor ⊥ is the initial object in [C^op, Sets]. The logical connectives
resemble the set-theoretic case, too, except for the last ones: if U and V are
representatives of subobjects of X such that U_0(C) ⊆ X_0(C) and V_0(C) ⊆ X_0(C), we have:

    U ≤ V iff U_0(C) ⊆ V_0(C) for all C;                    (E.102)
    (U ∧ V)_0(C) = U_0(C) ∩ V_0(C);                         (E.103)
    (U ∨ V)_0(C) = U_0(C) ∪ V_0(C);                         (E.104)

    (U → V)_0(C) = {x ∈ X_0(C) | ∀ f : D → C : X_1(f)(x) ∈ U_0(D) ⇒ X_1(f)(x) ∈ V_0(D)};
                                                            (E.105)

    ¬U_0(C) = {x ∈ X_0(C) | ∀ f : D → C : X_1(f)(x) ∉ U_0(D)}.
                                                            (E.106)


This Heyting algebra is Boolean iff ¬¬U = U for each U, so we are interested in

    ¬¬U_0(C) = {x ∈ X_0(C) | ∀ f : D → C ∃ g : E → D : X_1(gf)(x) ∈ U_0(E)}.     (E.107)

It can be shown that Sub(X) is Boolean for each object X iff Sub(Ω) is Boolean. In
order to settle this, we specialize (E.107) to subfunctors m : U → Ω, which gives

    ¬¬U_0(C) = {S ∈ Sieves(C) | ∀ f : D → C ∃ g : E → D : (gf)^*S ∈ U_0(E)}.     (E.108)

For example, if C = X^op is a posetal category, this expression becomes

    ¬¬U_0(x) = {S ∈ Upper(x) | ∀ y ≥ x ∃ z ≥ y : S ∩ (↑z) ∈ U_0(z)},             (E.109)

which is clearly an additional property of S ∈ U_0(x); examples abound in Chapter
12. Thus the (propositional) logic of Sub(X) may be genuinely intuitionistic (and 
given our examples, this conclusion especially applies to quantum logic). 
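
A minimal sketch of how (E.105) and (E.106) can fail to be Boolean, under simplifying assumptions that are not in the text: C is taken to be the two-element chain 0 ≤ 1 (so an arrow f : D → C just means D ≤ C), and X is the terminal presheaf, whose subobjects can then be identified with the down-closed subsets of C. The helper names are hypothetical.

```python
points = [0, 1]
leq = {(0, 0), (0, 1), (1, 1)}  # (d, c) in leq means d <= c, i.e. an arrow d -> c


def below(c):
    """All stages d that admit an arrow d -> c."""
    return {d for d in points if (d, c) in leq}


def implies(U, V):
    """(E.105) for subobjects of the terminal presheaf, as down-closed sets."""
    return frozenset(c for c in points if all(d not in U or d in V for d in below(c)))


def neg(U):
    """(E.106): c belongs to the negation iff no stage below c lies in U."""
    return implies(U, frozenset())


U = frozenset({0})          # a down-closed set that is neither empty nor everything
not_U = neg(U)              # nothing survives, because 0 lies below every stage
not_not_U = neg(not_U)      # hence the double negation is the whole poset
```

Here ¬U is empty while ¬¬U is all of C, so ¬¬U ≠ U and the Heyting algebra of subobjects in this example is not Boolean.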

Although X is an object in a topos, Sub(X) is a Heyting algebra in ordinary set 
theory. This is called an external description of X. Alternatively, one may study a 
topos using so-called internal reasoning. We will develop the logical foundation of 
internal reasoning (at least to some extent) in the next section, and for the moment 
just look at a special example, namely Heyting algebras within some given topos. 


Definition E.21. Let T be a topos (more generally, a category with all finite limits).
• A preorder on an object X ∈ T_0 in T is a mono m_≤ : R → X × X for which:

  1. The diagrammatic version of reflexivity (in set theory: x ≤ x) holds, as follows.
     The diagonal Δ_X ≡ Δ : X → X × X factors through m_≤, i.e. there is an
     arrow X → R such that the following diagram commutes:

         X ------> R
           \       |
          Δ \      | m_≤                                    (E.110)
             v     v
              X × X


  2. The diagrammatic version of transitivity (in set theory: x ≤ y and y ≤ z imply
     x ≤ z) holds, as follows. First, define P as the pullback

              p
         P ------> R
         |         |
       q |         | p_1 ∘ m_≤                              (E.111)
         v         v
         R ------> X
          p_2 ∘ m_≤

     where p_1, p_2 : X × X → X are the arrows in (E.32), and p, q are defined as in
     (E.44). The arrows p_2 ∘ m_≤ ∘ p : P → X and p_1 ∘ m_≤ ∘ q : P → X, then, yield an
     arrow ⟨p_1 ∘ m_≤ ∘ q, p_2 ∘ m_≤ ∘ p⟩ : P → X × X via (E.32), which must factor
     through m_≤, too.


• A partial order on X is a preorder that is antisymmetric (in set theory: x ≤ y
and y ≤ x imply x = y), in the following sense. First, define the twist map

    τ : X × X → X × X                                       (E.112)

by taking Z ⇝ X × X, Y ⇝ X, q_1 ⇝ p_2 and q_2 ⇝ p_1 in (E.32); in set theory, this
would be τ(x, y) = (y, x). This enables us to reverse the order by defining a monic

    m_≥ : R → X × X;                                        (E.113)
    m_≥ = τ ∘ m_≤,                                          (E.114)

with associated pullback

              p′
         P′ -----> R
         |         |
      q′ |         | m_≤                                    (E.115)
         v         v
         R ------> X × X.
             m_≥

The arrow m_≤ ∘ p′ = m_≥ ∘ q′ : P′ → X × X, then, must factor through Δ : X → X × X.
• A lattice in T is a partial order on some object X for which there are arrows

    ∧ : X × X → X;                                          (E.116)
    ∨ : X × X → X,                                          (E.117)

such that:

  1. The arrow m_≤ : R → X × X is an equalizer of the arrows ∧ : X × X → X and
     p_1 : X × X → X (in set theory this expresses the property x ≤ y iff x ∧ y = x).

  2. One has ∧ ∘ Δ = id_X and ∨ ∘ Δ = id_X (i.e., x ∧ x = x and x ∨ x = x).

  3. The following square (stating that x ∧ (y ∨ x) = (x ∧ y) ∨ x = x) commutes:

                  ∧
         X <------------ X × X
         ^                 ^
     p_1 |                 | id_X × ∨
         |        c        |
       X × X ---------> X × X × X                           (E.118)
         |                 |
     p_1 |                 | ∧ × id_X
         v                 v
         X <------------ X × X.
                  ∨

Here the middle arrow c is the composition

    X × X --(Δ × id_X)--> (X × X) × X --≅--> X × (X × X) --(id_X × τ)--> X × X × X.   (E.119)


• Let 1 be ‘the’ terminal object in T, with associated arrow X → X × 1 from (E.32),
with Z ⇝ X, q_1 ⇝ id_X, and Y ⇝ 1. A top element in an internal lattice is an
arrow ⊤ : 1 → X such that the following composite arrow is the identity id_X:

    X --≅--> X × 1 --(id_X × ⊤)--> X × X --∧--> X.          (E.120)

A bottom element is an arrow ⊥ : 1 → X for which the following arrow is id_X:

    X --≅--> X × 1 --(id_X × ⊥)--> X × X --∨--> X.          (E.121)

• A Heyting algebra in T is a lattice X with ⊤ and ⊥, endowed with an arrow

    → : X × X → X,                                          (E.122)

such that the monos m_1 and m_2 in the double pullback diagram

        P_1 ----------> R <---------- P_2
         |              |              |
     m_1 |              | m_≤          | m_2                (E.123)
         v              v              v
     X × X × X ----> X × X <---- X × X × X
            ∧ × id_X        id_X × →

are equivalent (and hence define the same subobject of X × X × X).


The reader may check that in Sets these definitions reduce to the usual ones; as one 
can clearly see, finding diagrammatic versions of familiar definitions is an art! 
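
As a sanity check of the claim that the definitions reduce to the usual ones in Sets, here is a small Python sketch on a hypothetical example, the divisors of 12 ordered by divisibility, with gcd as ∧ and lcm as ∨. It assumes that the Heyting condition encoded by (E.123) is the familiar one, x ∧ y ≤ z iff x ≤ (y → z); the function `implies` is an illustrative brute-force construction, not something taken from the text.

```python
from functools import reduce
from math import gcd

X = [1, 2, 3, 4, 6, 12]  # hypothetical example: divisors of 12


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


# Condition 1: R is the equalizer of meet and the first projection p_1.
R = {(x, y) for x in X for y in X if meet(x, y) == x}
divisibility = {(x, y) for x in X for y in X if y % x == 0}

# Conditions 2 and 3: idempotence and the absorption square (E.118).
idempotent = all(meet(x, x) == x == join(x, x) for x in X)
absorption = all(meet(x, join(y, x)) == x == join(meet(x, y), x) for x in X for y in X)

# Top and bottom elements, cf. (E.120) and (E.121).
top, bottom = 12, 1
top_ok = all(meet(x, top) == x for x in X)
bottom_ok = all(join(x, bottom) == x for x in X)


def implies(y, z):
    """Largest x with meet(x, y) <= z, computed as the join of all such x."""
    return reduce(join, [x for x in X if (meet(x, y), z) in R])


# The two subobjects of X x X x X compared in the Heyting condition.
P1 = {(x, y, z) for x in X for y in X for z in X if (meet(x, y), z) in R}
P2 = {(x, y, z) for x in X for y in X for z in X if (x, implies(y, z)) in R}
```

In this example R coincides with divisibility and P1 equals P2, so the diagrammatic conditions single out the ordinary lattice and Heyting structure.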
The most important example of an internal Heyting algebra in a topos is Q. 


Theorem E.22. The truth object Ω in a topos T with subobject classifier t : 1 → Ω
is a Heyting algebra in the partial ordering m_≤ : R → Ω × Ω defined as the equalizer
of the projection p_1 : Ω × Ω → Ω and the classifying map χ_⟨t,t⟩ : Ω × Ω → Ω of
the product arrow ⟨t, t⟩ : 1 → Ω × Ω derived from t : 1 → Ω. In particular:

  1. The inf arrow ∧ : Ω × Ω → Ω equals the classifying map χ_⟨t,t⟩ of ⟨t, t⟩.
  2. The sup arrow ∨ : Ω × Ω → Ω is the classifying map χ_∪ of the arrow (see below)

         (t, id_Ω) ∪ (id_Ω, t) : (1 × Ω) ∪ (Ω × 1) → Ω × Ω.     (E.124)

  3. The implication arrow → : Ω × Ω → Ω is the classifying map χ_{m_≤} of m_≤.
  4. The top element ⊤ : 1 → Ω coincides with the subobject classifier t : 1 → Ω.
  5. The bottom element ⊥ : 1 → Ω is equal to the classifying map χ_0 of 0 → 1.
  6. The negation arrow ¬ : Ω → Ω equals the classifying map χ_⊥ of ⊥ : 1 → Ω.

For every object Y ∈ T_0, this structure makes Hom_T(Y, Ω) an external Heyting
algebra (i.e., in Sets), such that (E.51) is an isomorphism of Heyting algebras.
We omit the proof of Theorem E.22, which is a straightforward verification.
In no. 1, the arrow ⟨t, t⟩ is a special case of the arrow ⟨f, g⟩ defined just before
(E.91). In no. 2, we need the following construction, applied to the arrows

    (t, id_Ω) : (1 × Ω) → (Ω × Ω);                          (E.125)
    (id_Ω, t) : (Ω × 1) → (Ω × Ω).                          (E.126)

To define maps like the one in (E.124) in general, recall the coproduct diagram

    [Coproduct diagram, not recoverable from the source.]   (E.127)

which is just the opposite of the product diagram (E.32). In particular, for any given
monos m_1 : X → Z and m_2 : Y → Z, we obtain a unique map

    (m_1, m_2) : X + Y → Z.                                 (E.128)

The image of the latter in the sense of its epi-mono factorization (m_1, m_2) = m ∘ e, i.e.

    (m_1, m_2) : X + Y ↠ X ∪ Y ↣ Z,                         (E.129)

is the mono denoted by m ∪ m′ : X ∪ Y → Z (which is called m in the above factorization).
In no. 5, 0 is the initial object in T. Note that the truth arrow t : 1 → Ω is the
same as the classifying map χ_{id_1} of the identity arrow id_1 : 1 → 1, so that all arrows
in Theorem E.22 are classifying maps.
In the presheaf topos [C^op, Sets], where C is any category, products are taken
pointwise, and also, set-theoretic intersection commutes with pullback of sieves:

    f^*(S ∩ S′) = f^*(S) ∩ f^*(S′)   (f : D → C; S, S′ ∈ Sieves(C)).     (E.130)

These facts imply that the component ∧_C of the natural transformation ∧ is just

    ∧_C(S, S′) = S ∩ S′   (S, S′ ∈ Sieves(C)),              (E.131)

which in turn implies that if R is taken to be a subfunctor of Ω × Ω, so that

    (m_≤)_C : R_0(C) → Sieves(C) × Sieves(C)                (E.132)

is the inclusion map, we have (S, S′) ∈ R_0(C) iff S ⊆ S′. We also find

    →_C(S, S′) = {f : D → C | f^*S ⊆ f^*S′};                (E.133)
    ¬_C(S) = {f : D → C | f^*S = ∅},                        (E.134)

which are easily checked to be natural in C. Finally, the top element ⊤_C ∈ Sieves(C)
is the maximal sieve, and similarly the bottom element ⊥_C is the empty sieve.
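
The formulas (E.130)–(E.134) can be tried out on a finite poset. The sketch below assumes a hypothetical poset with a ≤ c and b ≤ c (a and b incomparable), identifies a sieve on c with a down-closed subset of the elements below c, and takes the pullback along the unique arrow d → c to be intersection with the elements below d. All names are illustrative.

```python
from itertools import combinations

# Hypothetical poset: a <= c and b <= c, with a and b incomparable.
points = ["a", "b", "c"]
leq = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "c"), ("b", "c")}


def down(c):
    """All d with d <= c, i.e. the domains of arrows d -> c."""
    return frozenset(d for d in points if (d, c) in leq)


def is_sieve(S, c):
    """A sieve on c: a subset of down(c) that is closed under going further down."""
    return S <= down(c) and all(down(d) <= S for d in S)


def sieves(c):
    elems = sorted(down(c))
    candidates = (frozenset(k) for r in range(len(elems) + 1) for k in combinations(elems, r))
    return [S for S in candidates if is_sieve(S, c)]


def pullback(S, d):
    """f^*S for the unique arrow f: d -> c in a poset."""
    return S & down(d)


def implies(S, T, c):
    """(E.133): the arrows d -> c along which S pulls back into T."""
    return frozenset(d for d in down(c) if pullback(S, d) <= pullback(T, d))


def neg(S, c):
    """(E.134): the arrows d -> c along which S pulls back to the empty sieve."""
    return frozenset(d for d in down(c) if not pullback(S, d))


all_sieves = sieves("c")
pullback_commutes = all(
    pullback(S & T, d) == pullback(S, d) & pullback(T, d)
    for S in all_sieves for T in all_sieves for d in down("c")
)
implication_is_sieve = all(
    is_sieve(implies(S, T, "c"), "c") for S in all_sieves for T in all_sieves
)
adjunction = all(
    ((R & S) <= T) == (R <= implies(S, T, "c"))
    for R in all_sieves for S in all_sieves for T in all_sieves
)
```

In this example there are five sieves on c, the implication of two sieves is again a sieve, and it satisfies the Heyting adjunction with respect to intersection, as (E.131) and (E.133) suggest.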
E.4 Internal frames and locales in sheaf toposes


As we have seen in §D.1 as well as in §C.11, a complete Heyting algebra is the same
qua lattice structure as a frame, except that maps between frames are defined
differently: a frame map is required to preserve order and arbitrary suprema, whereas a
Heyting algebra map preserves order and implication. Furthermore, one has
locales, which are frames, too, except that maps go in the opposite direction. Hence
if Frm is the category of frames (within Sets), then the category Loc of locales is

    Loc = Frm^op.                                           (E.135)

We also recall the bizarre (but wonted) notation X for an object in Loc that is the
same as the object denoted by 𝒪(X) in Frm, where nothing is implied about the
spatiality of the frames in question (i.e., it is not necessarily the case that there is
an actual space X of which the given frame called 𝒪(X) is the topology). In the
same spirit, frame maps are written f^* : 𝒪(Y) → 𝒪(X) or f^{-1} : 𝒪(Y) → 𝒪(X), the
corresponding locale map being f : X → Y (which is the same map between the
same objects), once again, even if no space X in the usual sense is around.

In any case, in order to define internal frames, locales, or complete Heyting
algebras in a topos, one must define completeness of internal lattices. This is difficult
diagrammatically, but it can be done through the internal language of §E.5, e.g. by

    ⊩ ∀_S ∃_x (S ⊆ ↓x) ∧ ∀_y (S ⊆ ↓y → x ≤ y),              (E.136)

where S ⊆ X and x, y ∈ X (technically, S is a variable of type Ω^X, and x and y are
variables of type X, see §E.5). We may avoid this, however, since due to the
identification (E.84) in Chapter 12 we can work in a sheaf topos Sh(X), where internal
frames have a simple external description, as follows: there is an equivalence

    Frm_{Sh(X)} ≃ (Frm_{Sets}/𝒪(X))^op                      (E.137)

between the category of internal frames in Sh(X) and the category of frame maps in
Sets with domain 𝒪(X), where the arrows between two such maps

    π_Y^{-1} : 𝒪(X) → 𝒪(Y);                                 (E.138)
    π_Z^{-1} : 𝒪(X) → 𝒪(Z);                                 (E.139)

are the frame maps

    φ^{-1} : 𝒪(Z) → 𝒪(Y)                                    (E.140)

that satisfy

    φ^{-1} ∘ π_Z^{-1} = π_Y^{-1}.                           (E.141)

This looks more palpable in terms of the “virtual” underlying spaces (i.e. locales):
if (E.138)–(E.140) are seen as inverse images of maps π_Y : Y → X, π_Z : Z → X,
and φ : Y → Z, then the condition (E.141) corresponds to the equality π_Z ∘ φ = π_Y.
To explain the equivalence (E.137), we underline locales in Sh(X), writing Y̲
etc.; the corresponding internal frame is denoted by 𝒪(Y̲) (which is the same object
in Sh(X) as Y̲). The external description of Y̲ in Sets, then, is a continuous map

    π : Y → X,                                              (E.142)

where Y is a locale in Sets (in which X was a space to begin with), with frame

    𝒪(Y) = 𝒪(Y̲)(X).                                         (E.143)

Also here, the notation π : Y → X is purely symbolic, and stands for a frame map

    π^{-1} : 𝒪(X) → 𝒪(Y),                                   (E.144)

from which one may reconstruct Y̲ as the sheaf

    𝒪(Y̲) : U ↦ {V ∈ 𝒪(Y) | V ≤ π^{-1}(U)}   (U ∈ 𝒪(X)).     (E.145)
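
To make (E.143)–(E.145) concrete, the following sketch evaluates the sheaf U ↦ {V ∈ 𝒪(Y) | V ≤ π^{-1}(U)} for two small finite spaces. Everything specific here is a hypothetical choice: X is the Sierpinski space on {0, 1}, Y is a three-point space with a chain of opens, and π is one particular continuous map. The restriction map V ↦ V ∩ π^{-1}(U′) is likewise an assumption of this sketch; the text does not spell it out.

```python
X_opens = [frozenset(), frozenset({1}), frozenset({0, 1})]  # Sierpinski space on {0, 1}
Y_points = ["a", "b", "c"]
Y_opens = [frozenset(), frozenset("c"), frozenset("bc"), frozenset("abc")]
pi = {"a": 0, "b": 1, "c": 1}  # hypothetical continuous map Y -> X


def preimage(U):
    """The frame map pi^{-1}: O(X) -> O(Y), cf. (E.144)."""
    return frozenset(y for y in Y_points if pi[y] in U)


def sections(U):
    """The value of the sheaf (E.145) at U: the opens of Y below pi^{-1}(U)."""
    return [V for V in Y_opens if V <= preimage(U)]


def restrict(V, U_small):
    """Restriction to a smaller open, assumed to be V |-> V meet pi^{-1}(U_small)."""
    return V & preimage(U_small)


whole_space = frozenset({0, 1})
small_open = frozenset({1})
continuous = all(preimage(U) in Y_opens for U in X_opens)
global_sections = sections(whole_space)
local_sections = sections(small_open)
```

Evaluating at the whole space X returns every open of Y, in line with (E.143), while the smaller open {1} only sees the opens contained in π^{-1}({1}).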


The frame maps (E.138)–(E.140) yield an internal frame map φ^{-1} : 𝒪(Z̲) → 𝒪(Y̲)
in Sh(X), which is a natural transformation, by defining its components as

    φ^{-1}(U) : ↓π_Z^{-1}(U) → ↓π_Y^{-1}(U);                (E.146)
    S ↦ φ^{-1}(S).                                          (E.147)

As an application, the Dedekind real numbers ℝ can be axiomatized by what is
called a geometric propositional theory 𝕋. In any topos T (with natural numbers
object), such a theory determines a certain frame 𝒪(𝕋)_T, whose “points” are defined
as frame maps 𝒪(𝕋) → Ω, where Ω is the subobject classifier in T (more precisely,
the object of points of 𝒪(𝕋) in T is the subobject of Ω^{𝒪(𝕋)} consisting of frame
maps). If 𝕋_ℝ is the theory axiomatizing ℝ, in Sets one simply has the frame

    𝒪(𝕋_ℝ)_Sets = 𝒪(ℝ),                                     (E.148)

whose points are ℝ. More generally, if 𝕋 is some geometric propositional theory, and
X is a space with associated sheaf topos Sh(X), then the internal frame 𝒪(𝕋)_{Sh(X)}
is given by the sheaf (E.145) defined by taking the frame map (E.144) to be the
inverse image map π^{-1} = π_1^{-1} : 𝒪(X) → 𝒪(X × 𝒪(𝕋)_Sets) of the projection π_1 :
X × 𝒪(𝕋)_Sets → X onto the first component. Using (E.148), this yields the frame of
Dedekind real numbers 𝒪(ℝ) = 𝒪(𝕋_ℝ) in a sheaf topos Sh(X) as the sheaf

    𝒪(ℝ)_{Sh(X)} : U ↦ 𝒪(U × ℝ).                            (E.149)

The Dedekind real numbers object, on the other hand, is given by the sheaf

    (ℝ)_{Sh(X)} : U ↦ C(U, ℝ).                              (E.150)

Using (E.85), such results may immediately be transferred to T(A), see §12.1.
E.5 Internal language of a topos


The internal language (also called Mitchell–Bénabou language) of a given topos
T looks like a first-order language, except that it is typed (i.e., many-sorted), in that
each term σ has a certain type, written σ : X, indexed by the objects X of T. For
example, formulae (by definition) have type Ω. In addition, symbols, terms, and
formulae have a list FV(σ) of free variables. Furthermore, the internal language has
a canonical model in which it may be interpreted, whose carrier is T itself. We often
make no difference in notation between σ as an element of the internal language of
T and its interpretation [[σ]] in T, which is some arrow in T; the two are so closely
interwoven that making such a difference would be very artificial. Here are the rules.


• Constants c of type X correspond to arrows c : 1 → X (and so in Sets are elements
of X) and have no free variables, i.e. FV(c) = ∅. Here and in what follows, we
write ‘corresponds to’ in the following sense: for each arrow c : 1 → X there is a
constant c of type X, and the interpretation of this constant is this arrow.

• Logically interesting constants are the subobject classifier t : 1 → Ω, which as in
Theorem E.22 we often write as ⊤, and its antipode ⊥ : 1 → Ω, defined as the
classifying map for the mono 0 → 1 (where 0 is the initial object in T).

• Variables x of type X correspond to the identity id_X : X → X, with FV(x) = {x}.

• Function symbols f of type Y correspond to arrows f : X → Y. Thus in addition
to its type, f has a source (namely X). Arities are unnecessary here; we may
take X = Z × ··· × Z, with n factors, and say that f has source Z with arity n, but
this is superfluous. Similarly, predicate symbols P would be function symbols of
type Ω, and hence they are redundant (clearly, even constants and variables are
special cases of function symbols, but it is useful to keep them apart).

• Terms are built by iteratively applying the following formation rules:

  1. Constants and variables are terms of the given type.

  2. If τ : X is a term of type X, and f : X → Y is a function symbol, then f(τ) is
     a term of type Y, and FV(f(τ)) = FV(τ). Furthermore, [[f(τ)]] = f ∘ τ ≡ fτ.

  3. If we have n terms τ_i : X_i (i = 1, ..., n), with FV(τ_1) = ··· = FV(τ_n) = F, then
     ⟨τ_1, ..., τ_n⟩ is a term of type X_1 × ··· × X_n and FV(⟨τ_1, ..., τ_n⟩) = F.
     If τ_i has interpretation τ_i : Y → X_i, then ⟨τ_1, ..., τ_n⟩ : Y → X_1 × ··· × X_n is the
     corresponding product arrow, as defined (for n = 2) before (E.91).

  4. One may add free variables to terms; if τ : Z with interpretation τ : X → Z has
     a single free variable x : X, and we add a free variable y : Y, then the interpretation
     of the revised term τ′ with FV(τ′) = {x, y} is τ′ : X × Y --p_1--> X --τ--> Z (etc.).

  5. From τ : X with FV(τ) = {z_1, ..., z_n} with z_i : Z_i, and n terms σ_i : Z_i, all having
     the same free variables FV(σ_i) = {y_1, ..., y_m}, with y_j : Y_j, we can form a new
     term τ(σ_1, ..., σ_n) of type X (i.e. the same type τ had), with free variables

         FV(τ(σ_1, ..., σ_n)) = {y_1, ..., y_m}.             (E.151)

     As the notation suggests, the interpretation of τ(σ_1, ..., σ_n) is τ ∘ ⟨σ_1, ..., σ_n⟩.
• A formula is a term of type Ω. A sentence is a formula without free variables,
which is therefore interpreted as an arrow φ : 1 → Ω. The rules for formulae are:

  1. Let φ be a formula with FV(φ) = {x, y}, with x : X and y : Y. As in first-order
     logic, we may write φ as φ(x, y). Then {x | φ(x, y)} is a term of type Ω^X, with

         FV({x | φ(x, y)}) = {y}.                            (E.152)

     This rule implements the isomorphism (sometimes called λ-conversion)

         Hom_T(X × Y, Ω) ≅ Hom_T(Y, Ω^X),                    (E.153)

     which follows from the existence of exponentials in a topos. Indeed, (E.153)
     turns the interpretation φ : X × Y → Ω into an arrow {x | φ(x, y)} : Y → Ω^X.
     Similarly, from φ : X × Y → Ω we obtain a term {(x, y) | φ(x, y)} of type
     X × Y, which is none other than the subobject classified by φ.

     Taking Y = 1 to be the terminal object and using (E.51), we see that

         Hom_T(X, Ω) ≅ Sub(X) ≅ Hom_T(1, Ω^X),               (E.154)

     which shows that Ω^X plays the role the power set 𝒫(X) of X plays in Sets.

  2. If σ : Y and τ : Y are terms with the same free variables, then σ = τ is a
     formula having the same set of free variables as τ and σ. If σ : X → Y and
     τ : X → Y, then the interpretation [[σ = τ]] : X → Ω is the composite arrow

         X --⟨σ, τ⟩--> Y × Y --=_Y--> Ω,                     (E.155)

     where =_Y is the classifying map of the diagonal Δ_Y : Y → Y × Y.

  3. If τ : Y and σ : Ω^Y are terms with the same free variables, then τ ∈ σ is a
     formula with the same free variables. If τ : X → Y and σ : X → Ω^Y, then

         [[τ ∈ σ]] : X --⟨σ, τ⟩--> Ω^Y × Y --ev--> Ω.        (E.156)

  4. As in first-order (or propositional) logic, new formulae may be made from old
     ones using the logical connectives ∧, ∨, →, and ¬. To interpret such composites,
     it is convenient to assume that their components have the same free
     variables, which can always be achieved using rule 4 for term-building above
     (i.e., by adding free variables). So let φ : X → Ω and ψ : X → Ω be (interpretations
     of) formulae, and let • be either ∧, ∨, or →. We then define

         [[φ • ψ]] : X --⟨φ, ψ⟩--> Ω × Ω --•--> Ω,           (E.157)

     where the arrow • : Ω × Ω → Ω is defined from the Heyting algebra structure
     on Ω described in Theorem E.22. Similarly, negation is given by

         [[¬φ]] : X --φ--> Ω --¬--> Ω.                       (E.158)
  5. If a formula φ(x, y) contains x freely, as well as other free variables collectively
     called y, then ∃_x φ(x, y) is a formula, whose interpretation we now give,
     after a bit of preparation. First, consider the commutative diagram

             P -------> Z -------> 1
             |          |          |
         f^*m|        m |          | t                       (E.159)
             v          v          v
             X -------> Y -------> Ω
                  f          χ_m

     where m is a mono, so that its equivalence class defines an element of Sub(Y).
     Taking the pullback of either m and f, or, equivalently, of t and χ_m ∘ f, we
     obtain a monic f^*m : P → X, whose equivalence class is an element of Sub(X).
     Consequently, any arrow f : X → Y induces a map

         f^* : Sub(Y) → Sub(X),                              (E.160)

     which is a homomorphism of (external) Heyting algebras (i.e. in Sets). For
     example, in Sets, where Sub(X) may be identified with 𝒫(X) (see comment
     after (E.52)), the map f^* : 𝒫(Y) → 𝒫(X) is simply the inverse image f^{-1} of
     f. If we regard the lattices Sub(X) and Sub(Y) as posetal categories, the map
     f^* has both a left-adjoint and a right-adjoint, denoted by

         ∃_f : Sub(X) → Sub(Y);                              (E.161)
         ∀_f : Sub(X) → Sub(Y).                              (E.162)

     To justify this suggestive notation, replace X by X × Y and take f : X × Y → Y
     to be p_2 (i.e., projection on the second space). Hence this gives maps

         ∃_{p_2} : Sub(X × Y) → Sub(Y);                      (E.163)
         ∀_{p_2} : Sub(X × Y) → Sub(Y).                      (E.164)

     In Sets, we identify the Heyting algebras Sub(X × Y) and Sub(Y) (now
     Boolean) with 𝒫(X × Y) and 𝒫(Y), respectively, and obtain (on A ⊆ X × Y):

         ∃_{p_2}(A) = {y ∈ Y | ∃ x ∈ X : (x, y) ∈ A};        (E.165)
         ∀_{p_2}(A) = {y ∈ Y | ∀ x ∈ X : (x, y) ∈ A}.        (E.166)

     Returning to a general topos, given φ : X × Y → Ω, the diagram

         1 <------ {(x, y) | φ(x, y)} ------> ∃_{p_2}({(x, y) | φ(x, y)}) ------> 1
         |                 |                              |                       |
       t |                 |                              |                       | t
         v                 v                              v                       v
         Ω <------------ X × Y ------------------------>  Y --------------------> Ω
                 φ                      p_2                    [[∃_x φ(x, y)]]

     defines the interpretation [[∃_x φ(x, y)]] (with innocent abuse of notation in
     applying the map ∃_{p_2}). The interpretation of ∀_x φ(x, y) via ∀_{p_2} is quite similar.
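
The adjointness behind (E.161)–(E.166) can be checked by brute force in Sets. The sketch below uses two small hypothetical sets X and Y, implements p_2^* together with ∃_{p_2} and ∀_{p_2} as in (E.165) and (E.166), and tests the two defining equivalences, ∃_{p_2}(A) ⊆ B iff A ⊆ p_2^*(B) and p_2^*(B) ⊆ A iff B ⊆ ∀_{p_2}(A). The function names are illustrative.

```python
from itertools import chain, combinations

X = [0, 1]                 # hypothetical finite sets
Y = ["u", "v", "w"]
XY = [(x, y) for x in X for y in Y]


def powerset(s):
    """All subsets of s, as frozensets."""
    subsets = chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
    return [frozenset(c) for c in subsets]


def pullback_p2(B):
    """p_2^*(B): the inverse image of B under the projection X x Y -> Y."""
    return frozenset((x, y) for (x, y) in XY if y in B)


def exists_p2(A):
    """(E.165): those y for which some x has (x, y) in A."""
    return frozenset(y for y in Y if any((x, y) in A for x in X))


def forall_p2(A):
    """(E.166): those y for which every x has (x, y) in A."""
    return frozenset(y for y in Y if all((x, y) in A for x in X))


subsets_XY = powerset(XY)
subsets_Y = powerset(Y)
left_adjoint = all(
    (exists_p2(A) <= B) == (A <= pullback_p2(B)) for A in subsets_XY for B in subsets_Y
)
right_adjoint = all(
    (pullback_p2(B) <= A) == (B <= forall_p2(A)) for A in subsets_XY for B in subsets_Y
)
```

Both flags are true here, which is the sense in which the quantifiers ∃ and ∀ are the left and right adjoints of the inverse image map.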
We now define the (semantic) notion of truth for sentences in the internal language
of a topos; this is a far-reaching categorical generalization of the idea initially
studied in the straightforward context of propositional logic, cf. §D.2.


Definition E.23. 1. A sentence φ in the internal language of T is true, written ⊩ φ,
if its interpretation [[φ]] coincides with the subobject classifier t : 1 → Ω.

2. An open formula φ(x) is true if its interpretation [[φ(x)]] : X → Ω factors through
t, or, equivalently, if (the interpretation of) {x | φ(x)}, seen as the subobject of X
classified by φ (as explained between (E.153) and (E.154)), is X itself.


The two clauses of this definition are actually equivalent, since no. 1 is obviously a
special case of no. 2 by omitting the free variable x (and hence taking X = 1), but
also, the second reduces to the first, because φ(x) is true iff ∀_x φ(x) is true.

As a refinement of this concept of truth, for [[φ(x)]] : X → Ω as above, which we
simply write as φ : X → Ω, take an arrow f : Y → X. By definition:

    Y ⊩ φ(f) means ⊩ φ ∘ f.                                 (E.167)

If φ is a sentence (i.e. X = 1), this means that Y ⊩ φ(f) iff φ = χ_Y (in other words,
φ classifies Y → 1). There are (at least) two applications of this idea:


• The notion of partial truth states that φ is true at stage Y if Y ⊩ φ(f).

• We say that a set 𝒢 ⊂ T_0 of objects generates T if for every pair of parallel
arrows f : X → Y and h : X → Y, the property fg = hg for all arrows g : G → X
from all objects G ∈ 𝒢 implies f = h. For example, any singleton generates Sets,
and for any category C, it can be shown that the functors y_C generate the presheaf
topos [C^op, Sets], as C runs through C_0. In general, it then follows that ⊩ φ holds
iff G ⊩ φ for all G ∈ 𝒢 (and, implicitly, all arrows G → X).


Both play a role in Chapter 12, in the case where T = [C^op, Sets] for a poset C. By
the Yoneda Lemma E.15, any arrow α′ : y_C → X bijectively corresponds to some
element α ∈ X_0(C). In that case, we write C ⊩ φ(α) for y_C ⊩ φ(α′), which by (E.167)
means that the arrow φ ∘ α′ : y_C → Ω factors through the subobject classifier t : 1 → Ω.


Kripke–Joyal semantics unfolds the expression Y ⊩ φ(f) by looking at the formula
φ in terms of its constituent terms. As one sees in Chapter 12, since this procedure
may be used iteratively, it is extremely useful for computational purposes.

Although more than we need (which is the posetal case), we now give the rules
for the validity of C ⊩ φ(α) in an arbitrary presheaf topos [C^op, Sets], as just
mentioned; the posetal case follows in that f : D → C can only mean D ≤ C.

We use the following notation:


• In clauses 1–4 below, we assume φ : X → Ω, and also ψ : X → Ω (as already
noted, this can always be achieved by adding free variables to φ and/or ψ).

• In 5–6, we assume φ : X × Y → Ω so as to accommodate the free variable y : Y.

• In 7 and 8, we have τ : X → Y, with σ : X → Y in no. 7, and σ : X → Ω^Y in 8.


We then have the following forcing rules, which generalize the ones given at the
end of §D.3, and should be seen as theorems of categorical logic and topos theory:
1. C ⊩ φ(α) ∧ ψ(α) iff C ⊩ φ(α) and C ⊩ ψ(α).

2. C ⊩ φ(α) ∨ ψ(α) iff C ⊩ φ(α) or C ⊩ ψ(α).

3. C ⊩ φ(α) → ψ(α) iff D ⊩ φ(αf) implies D ⊩ ψ(αf) for each f : D → C.

4. C ⊩ ¬φ(α) iff no arrow f : D → C exists such that D ⊩ φ(αf) holds.

5. C ⊩ ∃_y φ(y)(α) iff there exists β ∈ Y_0(C) such that C ⊩ φ(α, β), where

       φ(α, β) : y_C → Ω;                                   (E.168)
       φ(α, β) = φ ∘ ⟨α′, β′⟩                               (E.169)

   is obtained by combining the maps α′ : y_C → X and β′ : y_C → Y into

       ⟨α′, β′⟩ : y_C → X × Y.                              (E.170)

   If φ has no free variables except y, then C ⊩ ∃_y φ(y) iff there is β ∈ Y_0(C) such
   that C ⊩ φ(β).

6. C ⊩ ∀_y φ(y)(α) iff D ⊩ φ(αf, β) for each f : D → C and each β ∈ Y_0(D).
   Here the arrow f : D → C induces a natural transformation f′ : y_D → y_C, yielding
   αf = α′ ∘ f′ : y_D → X, which combines with β′ : y_D → Y to

       ⟨αf, β′⟩ : y_D → X × Y.                              (E.171)

   Similarly to the previous case, if φ has no free variables except y, we have

       C ⊩ ∀_y φ(y) iff D ⊩ φ(β),                           (E.172)

   for each f : D → C and each β ∈ Y_0(D).

7. C ⊩ (τ = σ)(α) iff τ ∘ α′ = σ ∘ α′.

8. C ⊩ (τ ∈ σ)(α) iff the arrow

       ⟨σ ∘ α′, τ ∘ α′⟩ : y_C → Ω^Y × Y                     (E.173)

   factors through the subobject of Ω^Y × Y that is classified by the evaluation map
   ev : Ω^Y × Y → Ω. As a special case, take Y ⇝ 1 and hence τ : X → 1, so that

       σ : X → Ω^1 ≅ Ω                                      (E.174)

   corresponds to a subobject S → X (i.e. classified by σ = χ_S). The above subobject
   of Ω^1 × 1 ≅ Ω is then simply given by the truth arrow t : 1 → Ω. Writing x ∈ S
   for τ ∈ σ (where x : X is a variable of type X), we therefore obtain the rule:

9. C ⊩ (x ∈ S)(α) iff σ ∘ α′ : y_C → Ω factors through t (in other words, the subobject
   of y_C classified by σ ∘ α′ is y_C itself).
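
The propositional rules 1–4 lend themselves to a direct recursive evaluation. The following sketch does this in the posetal case only, under assumptions that go beyond the text: the poset is the two-element chain 0 ≤ 1, formulae are sentences (so α plays no role), and an atomic sentence is interpreted by a down-closed set of stages. The tuple encoding of formulae and the name `forces` are illustrative.

```python
points = [0, 1]
leq = {(0, 0), (0, 1), (1, 1)}       # (d, c): an arrow d -> c, i.e. d <= c
valuation = {"p": frozenset({0})}    # hypothetical atom, true exactly at stage 0


def below(c):
    """All stages d with an arrow d -> c."""
    return [d for d in points if (d, c) in leq]


def forces(c, phi):
    """Decide whether stage c forces phi, following forcing rules 1-4."""
    op = phi[0]
    if op == "atom":
        return c in valuation[phi[1]]
    if op == "and":
        return forces(c, phi[1]) and forces(c, phi[2])
    if op == "or":
        return forces(c, phi[1]) or forces(c, phi[2])
    if op == "implies":
        return all(not forces(d, phi[1]) or forces(d, phi[2]) for d in below(c))
    if op == "not":
        return not any(forces(d, phi[1]) for d in below(c))
    raise ValueError(f"unknown connective: {op}")


p = ("atom", "p")
excluded_middle = ("or", p, ("not", p))
double_negation = ("not", ("not", p))
double_negation_elimination = ("implies", double_negation, p)
```

In this example stage 0 forces p ∨ ¬p, but stage 1 does not, although stage 1 does force ¬¬p; this is a small instance of the intuitionistic behaviour described above.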
Notes


The standard introduction to category theory by one of its founders is Mac Lane 
(1998); see also the book by his student Awodey (2010), as well as the lecture notes 
by van Oosten (2002) and Cheng (2002). A nice book, which studies set theory 
from the point of view of category theory, is Lawvere & Rosebrugh (2003). At high- 
school level, see also Lawvere & Schanuel (1997) or (informally) Cheng (2015). 
Toposes were invented by Grothendieck in the early 1960s as part of his rebuild- 
ing of algebraic geometry; see Artin, Grothendieck, & Verdier (1972). The history 
and philosophy of category theory (including topos theory) has been described by 
Krömer (2007) and by Marquis (2009); for categorical logic see also Marquis &
Reyes (2012) and Bell (2005). According to a leading category & topos theorist: 


‘category theory was the objective form of dialectical materialism (...) set theory was con- 
sidered to be essentially bourgeois since it is founded on the relationship of belonging.’ 
(Marquis & Reyes, 2012, p. 30). 


Books on topos theory and categorical logic we used include (in increasing order
of scope and sophistication): Goldblatt (1984), Bell (1988), Borceux (1994), Mac
Lane & Moerdijk (1992), and last but not least, the encyclopedic Johnstone (2002).


§E.1. Basic definitions

von Neumann–Bernays–Gödel set theory is discussed in some detail in Mendelson
(2010); for algebraic set theory see Joyal & Moerdijk (1995). Category theorists
also typically rely on the notion of a Grothendieck Universe, see e.g. Mac Lane
(1998, §1.6), Marquis (2009, §5.5), and Krömer (2007, Ch. 6).


§E.2. Toposes and functor categories

An axiomatization of Grothendieck’s toposes (and certain generalizations thereof) 
equivalent to Definition E.12 was given in 1970 by Lawvere and Tierney (it seems 
to have been customary among the pioneers of topos theory, who also include Joyal, 
not to publish their findings too lavishly and in fact no joint paper by Lawvere & 
Tierney recording their definition seems to exist at least in the open literature). 


§E.3. Subobjects and Heyting algebras in a topos
See Mac Lane & Moerdijk (1992), §§I.8, IV.8, and Borceux (1994), §1.2.


§E.4. Internal frames and locales in sheaf toposes
The external description of internal locales in sheaf toposes originates with Joyal 
& Tierney (1984); see also Johnstone (2002), §C1.6. 


§E.5. Internal language of a topos

More details and proofs of the Kripke–Joyal semantics for the internal language
of a topos may be found in Bell (1988), Ch. 4, Mac Lane & Moerdijk (1992), §IV.6,
Borceux (1994), §6.6, and Johnstone (2002), §D1.2.

For an analysis of the notion of partial truth (as defined here) applied to quantum 
mechanics (differently from our Chapter 12), see Butterfield (2002). 
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bounded, 538, 569 
closable, 177, 572 
closed, 177, 571 
closure, 177, 572 
compact, 608, 609 
density, 622 
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domain, 570 
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maximal, 598 
multiplication, 571 
norm-positive, 583 
normal, 583 
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polar decomposition, 510 
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self-adjoint, 177, 569, 570 
square root, 617 
symmetric, 573 
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unbounded, 569 
unitary, 506, 510, 566 
order interval, 777 
order isomorphism, 127 
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strong, 380 
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subset of a set with a transition 
probability, 32 
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orthocomplement 
of a subset of a set with a 
transition probability, 32 
orthocomplementation, 779 
orthodoxy, 435 
orthogonal complement, 499, 562 
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orthomodular lattice, 78 
orthonormal basis, 497 
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of a set with a transition 
probability, 32 
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Outcome Independence, 218 
outcome spaces, 448 
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Paradox of Probability, 310 
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paraparticle, 278 
parastate, 278 
parastatistics, 276, 278 
Parseval’s equality, 565 
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partial order, 777 
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partition function, 356 
partition of unity, 705 
Pauli matrices, 130 
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paving conjecture, 118 
Peano Arithmetic, 793 

axioms, 797 
perfect anti-correlation, 215 
perfect correlation, 203, 215 
Peter—Weyl Theorem, 153 
phenomenological theory, 367 
Plancherel’s Theorem, 773 
Poincaré group, 272 
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point (of frame), 491 
Poisson algebra, 88 
Poisson bracket, 88 
Poisson derivation, 96 
Poisson geometry, 84 
Poisson manifold, 88 
Poisson tensor, 89 
polar decomposition, 510 
polar of subset, 244 
polarization identity, 496 
Pontryagin dual, 173 
Pontryagin duality, 720 
Pontryagin Duality Theorem, 173 
poset, 777 

directed, 777 
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inner product, 495 

metric, 516 

norm, 495 
Positive Operator Valued Measure, 
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of map on B(H) xa, 43 

of map on C(X), 30, 526 

of quantization, 295 
potential, 348 
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POVM, 74 
pre-inner product, 495 
predicate logic, 793 
predicate symbols, 793 
predual, 744 
preorder, 777 
in topos, 822 
presheaf, 815 
on X, 815 
representable, 816 
Principle of General Tovariance, 493 
Principle of the Identity of 
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Principle of Uniform Boundedness, 
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probability distribution 
on &(H), 72 
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probability measure 
K-exchangeable, 306 
finitely additive, 533 
completely additive on Y(H), 
119 
exchangeable, 306 
finitely additive on A(H), 119 
on Y(H), 59, 119 
on locale, 489 
permutation-invariant, 306 
probability space, 104, 523 
non-commutative, 696 
problem of outcomes, 448 
problem of statistics, 446 
product, 811 
binary, 811 
Product Extension (assumption), 222 
projection, 499, 502, 573 
atomic, 601 
finite, 750 
minimal, 750 
projections, 125, 333 
proof by contradiction, 787 
proposition, 784 
propositional logic, 784 
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normal, 125 
pure thermodynamics phase, 363 
purely logical symbols, 784, 793 
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quadratic form, 496 
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436 
quantum De Finetti Theorem, 301 
quantum event, 40, 103 
quantum Ising chain, 348 
quantum Ising model, 348 
quantum logic, 459 
Birkhoff—von Neumann, 75-79, 
81, 459 
intuitionistic, 471-475 
quantum probability distribution, 40 
quantum random variable, 103 
quantum spin systems, 318 
quantum toposophy, 459 
quantum-mechanical law of strong 
numbers, 314 
quasi-linear, 121 
quasi-local observables, 318 
quasi-local sequence, 303 
quasi-state, 61, 490 
strong, 120 
weak, 120 
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reading scale, 447 
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Dedekind, 461, 489 
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upper, 489 
real rank, 758 
reductio ad absurdum, 787 
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regular space, 83 
relation, 777 
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representation, 36 
admissible, 174, 176 
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induced, 263 
irreducible, 153, 693 
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nondegenerate, 695 
of C*-algebra, 691 
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primary, 319 
skew-adjoint, 155, 188 
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representations 
disjoint, 319 
equivalent, 164, 319 
equivalent admissible, 176
in a set with a transition probability, 32
Riesz Lemma, 563
Riesz Representation Theorem, 526
Riesz–Fréchet Theorem, 568
right translation, 153
σ-algebra, 523
Schatten–von Neumann ideals, 643
Schrödinger, E., 1, 2, 248, 249, 252,
Schur duality, 277
Schur’s Lemma, 153, 693
self-adjoint operator
maximal, 506
self-adjoint operators, 125, 333
self-consistency equation, 414
propositional logic, 784
semi-direct product, 256
semi-direct product algebroid, 260
semi-direct product groupoid, 259 
seminorm, 178
fundamental lemma, 533
sentence, 794 
in topos, 829 
separating duality, 546 
separation theorem, 544 
sequentially complete, 575
bounded, 576
set with a transition probability, 31 
set-theoretic universe, 802 
setting of experiment, 199
Sheffer stroke, 787
shift operator, 392 
sieve, 815 
maximal, 815
pullback, 815
simplex, 28, 561 
Choquet, 560 
SNAG-Theorem, 724 
Sobolev Embedding Theorem, 182
σ-compact, 527
as a groupoid, 259
compact, 83
Hausdorff (= T_2), 83
sober, 687
Stone, 748, 761, 780 
stonean, 748, 761 
totally disconnected, 780 
totally separated, 761 
spectral mapping property, 580, 659 
spectral order, 486 
spectral presheaf, 494 
spectral projection, 501, 588 
spectral radius, 578 
formula, 578
in a set with a transition probability, 33
spectral theorem for self-adjoint operators
bounded measurable functional
continuous functional calculus,
for unbounded operators, 633
space, 500
spectral theory, 515
discrete, 582
joint, 504 
point, 582 
residual, 641 
Spehner–Haake model, 453
spin, 160, 175, 272 
spontaneous symmetry breaking, 345, 367-433
double well, 371-378 
mean-field theories, 409-415 
quantum spin systems, 379-385 
state, 30 
K-exchangeable, 301 
π-normal, 319
clustering, 322 
coherent, 252, 371 
correlated, 243 
equilibrium, 345 
ergodic, 365 
Gibbs, 384 
ground, 345, 350, 353, 355 
infinitely exchangeable, 301 
KMS, 359 
local equilibrium, 356 
macroscopic, 324 
mixed, 31 
normal, 109 
on B(H), 43 
on B(H)_sa, 43
on C(X), 527
on C_0(X), 84, 529
on C*-algebra, 646 
permutation-invariant, 301, 326 
primary, 319 
probability measure, 28 
product, 243 
pure, 31 
quasi-free, 403 
singular, 112 
trivial at infinity, 366 
uncorrelated, 243 
state space, 28, 30, 763
normal total, 333
of B(H), 43
of B(H)_sa, 43
of C*-algebra, 334, 647 
pure, 334, 647 
states 
b-distinguishable, 446 
Stone spectrum, 781 
Stone’s Representation Theorem, 780
Stone’s Theorem, 184
Stone–Weierstrass Theorem, 555
strictly convex (normed space), 543 
strong (operator) topology, 574, 742 
strong continuity of group action, 344
structure constants, 97 
subcategory, 807 
full, 807 
subfunctor, 817 
subobject, 814 
subobject classifier, 462, 814 
in [C^op, Sets], 816
subrepresentation, 319
support
of function, 522 
of measure, 557 
supremum, 778 
symmetric sequence, 299 
symmetrization operator, 298 
symmetry, 125 
algebraic quantum theory, 333-366
Bohr, 127, 334 
Jordan, 126, 334 
Kadison, 126, 334 
Ludwig, 127, 334 
permutation, 275-288 
property of metric, 516 
quantum mechanics, 125-191 
spatial translation, 346 
spontaneously broken, 379
von Neumann, 127, 334
weak Jordan, 126, 334 
weakly broken, 379 
Wigner, 126, 334 
symmetry group, 345 
symplectic manifold, 89 
system of imprimitivity, 258 


tangent bundle (as Lie algebroid), 260
target map, 806 
tautological functor, 464 
tautology, 785 
tempered distribution, 178
tensor product
algebraic, 697 
C*-norm, 700 
cross-norm, 700
maximal C*-norm, 701
product state, 703 
projective, 701, 772 
spatial, 243 
state, 702 
term, 794
in topos, 828
tertium non datur, 75 
theorem, 786, 795 
theorem of the highest weight, 166 
theory 
fundamental, 367 
higher-level, 367 
reduced, 367
time-evolution, 345
Tomita–Takesaki Theorem, 755
Tomita–Takesaki theory, 754
top element, 777, 820
topos
definition, 815
topos theory
and quantum logic, 459-494 
introduction to, 805-833 
total set of states, 115 
trace, 508, 751 
finite, 751
semifinite, 751
transition probability, 31 
on P(B(H)), 47 
on pure state space, 765 
triangle inequality 
metric, 516 
norm, 495 
true, 76, 475, 802 
truth
in topos, 831
partial, 831
truth function, 785
truth object, 461, 814
truth table, 785
tubular neighbourhood theorem, 727 
twist map, 823
unitary representation, 151
upper semicontinuous partition, 336
bound, 794
free, 794
cyclic, 595, 691
definition, 742
injective, 754
maximal commutative, 2
standard form, 755
von Neumann chain, 437
546